PNDSPY1012

Message pandas.core.frame.DataFrame.query has a partial mapping because there is an unsupported scenario in Snowpark pandas.

Category Warning

Description

This issue appears when the SMA detects the use of pandas.core.frame.DataFrame.query. This method is commonly used to filter rows in pandas DataFrames, but Snowpark pandas supports it only partially: it does not support DataFrames that have a row MultiIndex, which can cause compatibility issues during migration or execution.

Scenario

Using query() with a row MultiIndex.

Input

The following example calls query() twice: first on a DataFrame with a single-level index, then on a DataFrame with a row MultiIndex.

import modin.pandas as pd

data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'age': [25, 30, 35, 28, 32, 45],
    'salary': [50000, 60000, 75000, 55000, 80000, 90000],
    'department': ['Sales', 'IT', 'HR', 'Sales', 'IT', 'HR']
}

df = pd.DataFrame(data)

df = df.set_index('name')

print("DataFrame with single-level index:")
print(df)

result = df.query("age > 30 and salary < 85000")


data = {
    'A': [1, 2, 3, 4, 5, 6],
    'B': [10, 20, 30, 40, 50, 60],
    'C': ['x', 'y', 'x', 'y', 'x', 'y']
}

df = pd.DataFrame(data)

df = df.set_index([
    pd.Index(['group1', 'group1', 'group2', 'group2', 'group3', 'group3']),
    pd.Index(['a', 'b', 'a', 'b', 'a', 'b'])
])
df.index.names = ['group', 'subgroup']

result = df.query("A > 2 and B < 55")

Output

The SMA adds the EWI PNDSPY1012 above each query() call in the output code to indicate that the call may hit a scenario that Snowpark pandas does not support.

from snowflake.snowpark.modin import plugin
import modin.pandas as pd

data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'age': [25, 30, 35, 28, 32, 45],
    'salary': [50000, 60000, 75000, 55000, 80000, 90000],
    'department': ['Sales', 'IT', 'HR', 'Sales', 'IT', 'HR']
}

df = pd.DataFrame(data)

df = df.set_index('name')

print("DataFrame with single-level index:")
print(df)

#EWI: PNDSPY1012 => pandas.core.frame.DataFrame.query does not support DataFrames that have a row MultiIndex. Check Snowpark pandas documentation for more detail.
result = df.query("age > 30 and salary < 85000")


data = {
    'A': [1, 2, 3, 4, 5, 6],
    'B': [10, 20, 30, 40, 50, 60],
    'C': ['x', 'y', 'x', 'y', 'x', 'y']
}

df = pd.DataFrame(data)

df = df.set_index([
    pd.Index(['group1', 'group1', 'group2', 'group2', 'group3', 'group3']),
    pd.Index(['a', 'b', 'a', 'b', 'a', 'b'])
])
df.index.names = ['group', 'subgroup']

#EWI: PNDSPY1012 => pandas.core.frame.DataFrame.query does not support DataFrames that have a row MultiIndex. Check Snowpark pandas documentation for more detail.
result = df.query("A > 2 and B < 55")
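One possible way to avoid the unsupported scenario is to move the MultiIndex levels into ordinary columns before filtering and restore them afterwards. The following is a minimal sketch, not an official recommendation: query_without_multiindex is a hypothetical helper name, and the sketch imports plain pandas only so that it runs anywhere. Whether reset_index and set_index behave the same way under Snowpark pandas is an assumption that should be verified against the Snowpark pandas documentation.

```python
import pandas as pd  # plain pandas for illustration; migrated code would import modin.pandas


def query_without_multiindex(df, expr):
    """Hypothetical helper: evaluate df.query(expr) without a row MultiIndex in place."""
    # A single-level index is not the unsupported scenario, so query directly.
    if not isinstance(df.index, pd.MultiIndex):
        return df.query(expr)

    index_names = list(df.index.names)
    # Unnamed levels cannot be restored by name after reset_index().
    if any(name is None for name in index_names):
        raise ValueError("all index levels must be named to restore the MultiIndex")

    # Move the index levels into columns, filter, then rebuild the MultiIndex.
    flat = df.reset_index()
    return flat.query(expr).set_index(index_names)


df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5, 6],
        "B": [10, 20, 30, 40, 50, 60],
        "C": ["x", "y", "x", "y", "x", "y"],
    },
    index=pd.MultiIndex.from_arrays(
        [
            ["group1", "group1", "group2", "group2", "group3", "group3"],
            ["a", "b", "a", "b", "a", "b"],
        ],
        names=["group", "subgroup"],
    ),
)

result = query_without_multiindex(df, "A > 2 and B < 55")
print(result)
```

Because the index levels become columns while the expression is evaluated, the expression can still refer to them by name (for example, group == "group2"). The helper raises an error if an index level is unnamed, since it restores the MultiIndex by level name.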