w3resource

Pandas DataFrame: query() function

DataFrame - query() function

The query() function is used to query the columns of a DataFrame with a boolean expression.

Syntax:

DataFrame.query(expr, inplace=False, **kwargs)

Parameters:

Name | Description | Type/Default Value | Required / Optional
expr | The query string to evaluate. You can refer to variables in the environment by prefixing them with an ‘@’ character, like @a + b. Since version 0.25.0, you can also refer to column names that contain spaces by surrounding them in backticks. For example, if one of your columns is called a a and you want to sum it with b, your query should be `a a` + b. | str | Required
inplace | Whether the query should modify the data in place or return a modified copy. | bool, default False | Optional
**kwargs | See the documentation for eval() for complete details on the keyword arguments accepted by DataFrame.query(). | | Optional
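The two expr features described above can be sketched as follows. The data, the column names and the variable threshold are made up for illustration only.

```python
import pandas as pd

# Illustrative data; the column names are assumptions for this sketch.
df = pd.DataFrame({"a a": [1, 2, 3], "b": [10, 20, 30]})

threshold = 15  # ordinary Python variable, referenced as @threshold below

# '@' pulls a variable from the surrounding Python scope into the query.
above = df.query("b > @threshold")

# Backticks quote a column name that contains a space (pandas 0.25.0 or later).
total_over_21 = df.query("`a a` + b > 21")

print(above)
print(total_over_21)
```

Both queries should keep the rows labelled 1 and 2.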

Returns: DataFrame
DataFrame resulting from the provided query expression.
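A minimal sketch of the two inplace modes, using made-up data. In the pandas versions this sketch was written against, a call with inplace=True filters the frame itself and returns None rather than a DataFrame; treat that as an observation about those versions, not a statement from the text above.

```python
import pandas as pd

# Illustrative data for this sketch.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [4, 3, 2, 1]})

# Default (inplace=False): a new DataFrame is returned and df is left unchanged.
filtered = df.query("A > B")

# inplace=True: the frame itself is filtered; the call returns None.
working = df.copy()
result = working.query("A > B", inplace=True)

print(filtered)
print(result)
print(working)
```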

Notes:

The result of the evaluation of this expression is first passed to DataFrame.loc and if that fails because of a multidimensional key (e.g., a DataFrame) then the result will be passed to DataFrame.__getitem__().

This method uses the top-level eval() function to evaluate the passed query.

The query() method uses a slightly modified Python syntax by default. For example, the & and | (bitwise) operators have the precedence of their boolean cousins, and and or. The expression is still syntactically valid Python, but its semantics are different.
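The precedence difference can be seen in a small sketch with made-up data: inside query(), & needs no parentheses, while the same filter written as ordinary Python boolean indexing does.

```python
import pandas as pd

# Illustrative data for this sketch.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [8, 6, 4, 2]})

# Inside query(), '&' binds as loosely as 'and', so no parentheses are needed.
with_ampersand = df.query("a > 1 & b > 3")
with_and = df.query("a > 1 and b > 3")

# Plain Python gives '&' higher precedence than '>', so the equivalent
# boolean-indexing expression needs explicit parentheses.
with_mask = df[(df["a"] > 1) & (df["b"] > 3)]

print(with_ampersand)
```

All three results should contain the same two rows (labels 1 and 2).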

You can change the semantics of the expression by passing the keyword argument parser='python'. This enforces the same semantics as evaluation in Python space. Likewise, you can pass engine='python' to evaluate an expression using Python itself as a backend. This is not recommended as it is inefficient compared to using numexpr as the engine.
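As a rough sketch of these keyword arguments, with made-up data: under parser='python' the & operator keeps Python's own precedence, so each comparison is parenthesised explicitly here. The performance difference between engines is not visible on a frame this small.

```python
import pandas as pd

# Illustrative data for this sketch.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [8, 6, 4, 2]})

# Default settings: the pandas parser.
default_result = df.query("a > 1 & b > 3")

# parser='python': '&' has Python's precedence, so the comparisons
# are wrapped in parentheses to keep the intended meaning.
python_parser = df.query("(a > 1) & (b > 3)", parser="python")

# engine='python': the same expression evaluated by Python itself
# instead of numexpr.
python_engine = df.query("a > 1 & b > 3", engine="python")

print(default_result)
```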

The DataFrame.index and DataFrame.columns attributes of the DataFrame instance are placed in the query namespace by default, which allows you to treat both the index and columns of the frame as a column in the frame. The identifier index is used for the frame index; you can also use the name of the index to identify it in a query. Please note that Python keywords may not be used as identifiers.
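A short sketch of querying on the index, assuming a frame whose index is named row_id (a name invented for this example) and which has no column called index:

```python
import pandas as pd

# Illustrative frame; 'row_id' and 'value' are hypothetical names.
df = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=pd.Index([0, 1, 2, 3], name="row_id"),
)

# The identifier 'index' refers to the frame's index
# (there is no column called 'index' in this frame).
by_keyword = df.query("index >= 2")

# A named index can also be referenced by its name.
by_name = df.query("row_id >= 2")

# Index and column conditions can be mixed in one expression.
mixed = df.query("row_id >= 1 and value < 40")

print(by_keyword)
print(mixed)
```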

Example:
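The original example is not reproduced in this text, so the following is a stand-in sketch with made-up data. It compares two columns with query() and shows the equivalent boolean-indexing form.

```python
import pandas as pd

# Illustrative data: A counts up from 1 to 5, B counts down from 10 to 2.
df = pd.DataFrame({"A": range(1, 6), "B": range(10, 0, -2)})

# Rows where column A exceeds column B.
greater = df.query("A > B")

# Rows where the two columns are equal.
equal = df.query("A == B")

# The same filter as 'greater', written as boolean indexing.
same_as_greater = df[df["A"] > df["B"]]

print(greater)
print(equal)
```

With this data, A > B holds only for the last row (A=5, B=2) and A == B only for the row where both are 4.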

