
The code that triggers the SettingWithCopyWarning is fully reproducible. See the edit below for more inconsistent behaviour around when the warning appears. FYI, this is not a duplicate of these three questions:

  1. python astype(str) gives SettingWithCopyWarning and requests I use loc
    • "At some point prior to the assignment, you created df in such a way that it became a view into another dataframe."
    • Mine can be created from scratch with the code below, and not from another df.
  2. Python - Getting “SettingWithCopyWarning” despite using df.loc
    • "Create the 2nd dataframe with the following code..."
    • I'm only creating and using one DataFrame.
  3. How to deal with SettingWithCopyWarning in Pandas
    • I am using .loc, but I'm assigning to a column slice, which isn't covered in the answers there.

The code:

import pandas as pd

df = pd.DataFrame({
    'Fruits': ['apple', 'orange', 'banana', 'pineapple', 'watermelon'],
    'Volume 1': [10, 20, 30, 700, 800],
    'Volume 2': [100, 200, 300, 7000, 8000],
    'Volume 3': [1, 2, 3, 70, 8],
    'Volume 4': [9, 8, 7, 6, 5],
    'Other': [-1, -2, -3, -4, -5]
})

df.loc[:, 'Volume 2':'Volume 4'] = df.loc[:, 'Volume 2':'Volume 4'].shift(1)
df = df.dropna()
# All okay so far, no warnings


# This next line issues a `SettingWithCopyWarning`:
df.loc[:, 'Volume 2':'Volume 4'] = df.loc[:, 'Volume 2':'Volume 4'].astype(int)  # warning
# and if I now re-run the shift() line from above, it also issues the SWC warning:
df.loc[:, 'Volume 2':'Volume 4'] = df.loc[:, 'Volume 2':'Volume 4'].shift(1)  # warning

# And any following operation now issues the SWCW:
df['A'] = 'foo'  # new column
df['Fruits'] = 'a fruit'  # existing column

I can get rid of the warning by doing either one of:

df = df.dropna()
# or
df = df.copy()

If I then redo different versions of the previous astype() step, there's no warning:

df.loc[:, 'Volume 2':'Volume 4'] = df.loc[:, 'Volume 2':'Volume 4'].astype(float)  # no warning
df.loc[:, 'Volume 2':'Volume 4'] = df.loc[:, 'Volume 2':'Volume 4'].astype(int)  # no warning

If the first astype() step issues SettingWithCopyWarning, why didn't these? Shouldn't it be consistent? I'm on pandas version '1.2.2'.

Btw, even if I explicitly list all the columns I'm converting in the first astype() step, each of these variants also issues the SWC warning:

df.loc[:, 'Volume 2':'Volume 4'] = df[['Volume 2', 'Volume 3', 'Volume 4']].astype(int)
df[['Volume 2', 'Volume 3', 'Volume 4']] = df.loc[:, 'Volume 2':'Volume 4'].astype(int)
df[['Volume 2', 'Volume 3', 'Volume 4']] = df[['Volume 2', 'Volume 3', 'Volume 4']].astype(int)

Edit/Update: In JupyterLab and IPython, if dropna() is executed in the same cell as the shift() step, then no warning is issued. So this might be kernel-related. Here's a pastebin link to the IPython code.
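To check whether the inconsistency depends on notebook cell boundaries, the warning can be captured explicitly in a plain script. This is only a minimal sketch: `swc_warnings` and `add_column` are hypothetical helper names, not pandas API, and the small frame is made-up sample data. Whether the first call reports a warning depends on the pandas version (newer releases may not emit SettingWithCopyWarning at all), so no expected counts are shown.

```python
import warnings

import pandas as pd


def swc_warnings(action):
    """Run action() and return the messages of any SettingWithCopyWarning it raised."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" stops Python's default filters from hiding repeated warnings
        warnings.simplefilter("always")
        action()
    return [str(w.message) for w in caught
            if w.category.__name__ == "SettingWithCopyWarning"]


def add_column(frame):
    # Same kind of assignment as `df['A'] = 'foo'` in the question
    frame['A'] = 'foo'


# Made-up sample data with one missing value so that dropna() really filters
df = pd.DataFrame({
    'Fruits': ['apple', 'orange', 'banana'],
    'Volume 1': [10.0, None, 30.0],
})

filtered = df.dropna()         # filtered result, original df still referenced
explicit = df.dropna().copy()  # same rows, but an explicit copy

print(len(swc_warnings(lambda: add_column(filtered))))  # version-dependent
print(len(swc_warnings(lambda: add_column(explicit))))
```

Running the same statements through a helper like this, once per line and once grouped together, would show whether the warning follows the code or the execution environment.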

aneroid
  • Hmmm, hard question. But if you use `.copy()` after each filter, then it should work well. If not, I agree: sometimes a warning, sometimes not. – jezrael Feb 23 '21 at 08:29
  • @jezrael **_How_** did that work? Grrr! :-) Could you explain or point me to some documentation about that? I've gone through "[Returning a view versus a copy](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy)" already; but to me, that doesn't explain why `df = df.dropna()` is the cause. Seems like the re-assigned `df` points to an in-memory-only copy of itself which has no other _variable_-based references. Otherwise, I'd have to make `df.dropna(inplace=True)` my default way of using it (which also fixes the issue). – aneroid Feb 23 '21 at 09:18
  • Hmm, it's because of the filtering, so the same warning is possible with `df.drop_duplicates()`, just like `df.dropna()`, `df.query('col > 100')` or `df[df['col'] > 100]`. – jezrael Feb 23 '21 at 09:21
  • Ah ok, so it has nothing to do with _chained indexing_ or `__getitem__(...).__setitem__(...)` as described in the docs? Didn't realise _filtering_ operations would cause that problem if re-assigning back to the original `df` or whichever variable. Would that mean, in general, when trimming down a dataframe while filtering, one should always re-assign with `.copy()` ? – aneroid Feb 23 '21 at 09:27
  • `Would that mean, in general, when trimming down a dataframe while filtering, one should always re-assign with .copy()` Yes, agreed. Then this should always prevent the warning. – jezrael Feb 23 '21 at 09:28
  • If you ignore the astype(int) for a bit, do you still get the warning? I am guessing this might be due to a dtype mismatch. But yes, agreed on the `copy()` from the previous comments. – anky Feb 23 '21 at 14:24
  • @anky after the first `df = df.dropna()`, if I don't do the `astype` step and directly do `df['A'] = 'bar'`, I still get the SWC warning. – aneroid Feb 23 '21 at 17:03
  • @jezrael If I do the `df = df.dropna()` step _twice_ before doing `astype`, then I don't get the SWC warning. The 'double dropna filtering' somehow stops the warning; or makes it not a copy of the original. – aneroid Feb 23 '21 at 17:06
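The pattern suggested in the comments above (re-assigning with `.copy()` after a filtering step) might look like the following with the question's data. This is a sketch rather than a confirmed fix for every case; it uses the `df[[...]] = ...` form of assignment, and `cols` is a hypothetical helper list introduced for brevity.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['apple', 'orange', 'banana', 'pineapple', 'watermelon'],
    'Volume 1': [10, 20, 30, 700, 800],
    'Volume 2': [100, 200, 300, 7000, 8000],
    'Volume 3': [1, 2, 3, 70, 8],
    'Volume 4': [9, 8, 7, 6, 5],
    'Other': [-1, -2, -3, -4, -5]
})

# Hypothetical helper list: the columns covered by the 'Volume 2':'Volume 4' slice
cols = ['Volume 2', 'Volume 3', 'Volume 4']

# Shift the three columns down by one row; the first row becomes NaN
df[cols] = df[cols].shift(1)

# Filter, then take an explicit copy so that df is an independent frame
df = df.dropna().copy()

# Convert the shifted columns back to integers
df[cols] = df[cols].astype(int)

print(df.dtypes)
```

The only change from the original code is the `.copy()` after `dropna()`, so it costs one extra copy of the filtered frame.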
