modin.pandas.DataFrame.sort_values¶

DataFrame.sort_values(by, axis=0, ascending=True, inplace: bool = False, kind='quicksort', na_position='last', ignore_index: bool = False, key: IndexKeyFunc | None = None)[source]¶

Sort by the values along either axis.

Parameters:
  • by (str or list of str) – Name or list of names to sort by. - if axis is 0 or ‘index’ then by may contain index levels and/or column labels. - if axis is 1 or ‘columns’ then by may contain column levels and/or index labels.

  • axis ({0 or ‘index’, 1 or ‘columns’}, default 0) – Axis to be sorted.

  • ascending (bool or list of bool, default True) – Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, must match the length of the by.

  • inplace (bool, default False) – If True, perform operation in-place.

  • kind ({'quicksort', 'mergesort', 'heapsort', 'stable'} default 'None') – Choice of sorting algorithm. By default, Snowpark Pandaas performs unstable sort. Please use ‘stable’ to perform stable sort. Other choices ‘quicksort’, ‘mergesort’ and ‘heapsort’ are ignored.

  • na_position ({'first', 'last'}, default 'last') – Puts NaNs at the beginning if first; last puts NaNs at the end.

  • ignore_index (bool, default False) – If True, the resulting axis will be labeled 0, 1, …, n - 1.

  • key (callable, optional) – Apply the key function to the values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return a Series with the same shape as the input. It will be applied to each column in by independently.

Returns:

DataFrame with sorted values or None if inplace=True.

Return type:

DataFrame or None

Notes

Snowpark pandas API doesn’t currently support distributed computation of sort_values when ‘key’ argument is provided or frame is sorted on ‘columns’ axis.

See also

DataFrame.sort_index

Sort a DataFrame by the index.

Series.sort_values

Similar method for a Series.

Examples

>>> df = pd.DataFrame({
...     'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
...     'col2': [2, 1, 9, 8, 7, 4],
...     'col3': [0, 1, 9, 4, 2, 3],
...     'col4': ['a', 'B', 'c', 'D', 'e', 'F']
... })
>>> df
   col1  col2  col3 col4
0     A     2     0    a
1     A     1     1    B
2     B     9     9    c
3  None     8     4    D
4     D     7     2    e
5     C     4     3    F
Copy

Sort by col1

>>> df.sort_values(by=['col1'])
   col1  col2  col3 col4
0     A     2     0    a
1     A     1     1    B
2     B     9     9    c
5     C     4     3    F
4     D     7     2    e
3  None     8     4    D
Copy

Sort by multiple columns

>>> df.sort_values(by=['col1', 'col2'])
   col1  col2  col3 col4
1     A     1     1    B
0     A     2     0    a
2     B     9     9    c
5     C     4     3    F
4     D     7     2    e
3  None     8     4    D
Copy

Sort Descending

>>> df.sort_values(by='col1', ascending=False)
   col1  col2  col3 col4
4     D     7     2    e
5     C     4     3    F
2     B     9     9    c
0     A     2     0    a
1     A     1     1    B
3  None     8     4    D
Copy

Putting NAs first

>>> df.sort_values(by='col1', ascending=False, na_position='first')
   col1  col2  col3 col4
3  None     8     4    D
4     D     7     2    e
5     C     4     3    F
2     B     9     9    c
0     A     2     0    a
1     A     1     1    B
Copy