From 2e1b427b958a52402fbf0798e66124d57cea08c5 Mon Sep 17 00:00:00 2001 From: Marc Garcia Date: Wed, 21 May 2025 13:31:31 +0200 Subject: [PATCH 1/4] DOC: Restructure and expand UDF page --- .../user_guide/user_defined_functions.rst | 172 +++++++++++++----- 1 file changed, 125 insertions(+), 47 deletions(-) diff --git a/doc/source/user_guide/user_defined_functions.rst b/doc/source/user_guide/user_defined_functions.rst index c2472b3c229db..ffa6ac6e8aa47 100644 --- a/doc/source/user_guide/user_defined_functions.rst +++ b/doc/source/user_guide/user_defined_functions.rst @@ -96,15 +96,15 @@ User-Defined Functions can be applied across various pandas methods: +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ | :meth:`apply` (axis=1) | Row (Series) | Row (Series) | Apply a function to each row | +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`agg` | Series/DataFrame | Scalar or Series | Aggregate and summarizes values, e.g., sum or custom reducer | +| :meth:`pipe` | Series or DataFrame | Series or DataFrame | Chain functions together to apply to Series or Dataframe | +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`transform` (axis=0) | Column (Series) | Column(Series) | Same as :meth:`apply` with (axis=0), but it raises an exception if the function changes the shape of the data | +| :meth:`filter` | Series or DataFrame | Boolean | Only accepts UDFs in group by. 
Function is called for each group, and the group is removed from the result if the function returns ``False`` | +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`transform` (axis=1) | Row (Series) | Row (Series) | Same as :meth:`apply` with (axis=1), but it raises an exception if the function changes the shape of the data | +| :meth:`agg` | Series or DataFrame | Scalar or Series | Aggregate and summarizes values, e.g., sum or custom reducer | +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`filter` | Series or DataFrame | Boolean | Only accepts UDFs in group by. Function is called for each group, and the group is removed from the result if the function returns ``False`` | +| :meth:`transform` (axis=0) | Column (Series) | Column (Series) | Same as :meth:`apply` with (axis=0), but it raises an exception if the function changes the shape of the data | +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`pipe` | Series/DataFrame | Series/DataFrame | Chain functions together to apply to Series or Dataframe | +| :meth:`transform` (axis=1) | Row (Series) | Row (Series) | Same as :meth:`apply` with (axis=1), but it raises an exception if the function changes the shape of the data | +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ When applying UDFs in 
pandas, it is essential to select the appropriate method based @@ -118,53 +118,108 @@ decisions, ensuring more efficient and maintainable code. and :ref:`ewm()` for details. -:meth:`DataFrame.apply` -~~~~~~~~~~~~~~~~~~~~~~~ +:meth:`Series.map` and :meth:`DataFrame.map` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The :meth:`apply` method allows you to apply UDFs along either rows or columns. While flexible, -it is slower than vectorized operations and should be used only when you need operations -that cannot be achieved with built-in pandas functions. +The :meth:`map` method is used specifically to apply element-wise UDFs. This means the function +will be called for each element in the ``Series`` or ``DataFrame``, with the individual value or +the cell as the function argument. -When to use: :meth:`apply` is suitable when no alternative vectorized method or UDF method is available, -but consider optimizing performance with vectorized operations wherever possible. +.. ipython:: python -:meth:`DataFrame.agg` -~~~~~~~~~~~~~~~~~~~~~ + temperature_celsius = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], + }) -If you need to aggregate data, :meth:`agg` is a better choice than apply because it is -specifically designed for aggregation operations. + def to_fahrenheit(value): + return value * (9 / 5) + 32 -When to use: Use :meth:`agg` for performing custom aggregations, where the operation returns -a scalar value on each input. + temperature_celsius.map(to_fahrenheit) -:meth:`DataFrame.transform` -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In this example, the function ``to_fahrenheit`` will be called 6 times, once for each value +in the ``DataFrame``. And the result of each call will be returned in the corresponding cell +of the resulting ``DataFrame``. -The :meth:`transform` method is ideal for performing element-wise transformations while preserving the shape of the original DataFrame. 
-It is generally faster than apply because it can take advantage of pandas' internal optimizations. +In general, ``map`` will be slow, as it will not make use of vectorization. Instead, a Python +function call for each value will be required, which will slow down things significantly if +working with medium or large data. -When to use: When you need to perform element-wise transformations that retain the original structure of the DataFrame. +When to use: Use :meth:`map` for applying element-wise UDFs to DataFrames or Series. -.. code-block:: python +:meth:`Series.apply` and :meth:`DataFrame.apply` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - from sklearn.linear_model import LinearRegression +The :meth:`apply` method allows you to apply UDFs for a whole column or row. This is different +from :meth:`map` in that the function will be called for each column (or row), not for each individual value. - df = pd.DataFrame({ - 'group': ['A', 'A', 'A', 'B', 'B', 'B'], - 'x': [1, 2, 3, 1, 2, 3], - 'y': [2, 4, 6, 1, 2, 1.5] - }).set_index("x") +.. ipython:: python - # Function to fit a model to each group - def fit_model(group): - x = group.index.to_frame() - y = group - model = LinearRegression() - model.fit(x, y) - pred = model.predict(x) - return pred + temperature_celsius = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], + }) - result = df.groupby('group').transform(fit_model) + def to_fahrenheit(column): + return column * (9 / 5) + 32 + + temperature_celsius.apply(to_fahrenheit) + +In the example, ``to_fahrenheit`` will be called only twice, as opposed to the 6 times with :meth:`map`. +This will be faster than using :meth:`map`, since the operations for each column are vectorized, and the +overhead of iterating over data in Python and calling Python functions is significantly reduced. + +In some cases, the function may require all the data to be able to compute the result. 
So :meth:`apply` +is needed, since with :meth:`map` the function can only access one element at a time. + +.. ipython:: python + + temperature = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], + }) + + def normalize(column): + return column / column.mean() + + temperature.apply(normalize) + +In the example, the ``normalize`` function needs to compute the mean of the whole column in order +to divide each element by it. So, we cannot call the function for each element, but we need the +function to receive the whole column. + +:meth:`apply` can also execute the function by row, by specifying ``axis=1``. + +.. ipython:: python + + temperature = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], + }) + + def hotter(row): + return row["Los Angeles"] - row["NYC"] + + temperature.apply(hotter, axis=1) + +In the example, the function ``hotter`` will be called 3 times, once for each row. And each +call will receive the whole row as the argument, allowing computations that require more than +one value in the row. + +``apply`` is also available for :meth:`SeriesGroupBy.apply`, :meth:`DataFrameGroupBy.apply`, +:meth:`Rolling.apply`, :meth:`Expanding.apply` and :meth:`Resampler.apply`. You can read more +about ``apply`` in groupby operations :ref:`groupby.apply`. + +When to use: :meth:`apply` is suitable when no alternative vectorized method or UDF method is available, +but consider optimizing performance with vectorized operations wherever possible. + +:meth:`DataFrame.pipe` +~~~~~~~~~~~~~~~~~~~~~~ + +The :meth:`pipe` method is useful for chaining operations together into a clean and readable pipeline. +It is a helpful tool for organizing complex data processing workflows. + +When to use: Use :meth:`pipe` when you need to create a pipeline of operations and want to keep the code readable and maintainable. 
:meth:`DataFrame.filter` ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -199,20 +254,43 @@ When to use: Use :meth:`filter` when you want to use a UDF to create a subset of Since filter does not directly accept a UDF, you have to apply the UDF indirectly, for example, by using list comprehensions. -:meth:`DataFrame.map` +:meth:`DataFrame.agg` ~~~~~~~~~~~~~~~~~~~~~ -The :meth:`map` method is used specifically to apply element-wise UDFs. +If you need to aggregate data, :meth:`agg` is a better choice than apply because it is +specifically designed for aggregation operations. -When to use: Use :meth:`map` for applying element-wise UDFs to DataFrames or Series. +When to use: Use :meth:`agg` for performing custom aggregations, where the operation returns +a scalar value on each input. -:meth:`DataFrame.pipe` -~~~~~~~~~~~~~~~~~~~~~~ +:meth:`DataFrame.transform` +~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The :meth:`pipe` method is useful for chaining operations together into a clean and readable pipeline. -It is a helpful tool for organizing complex data processing workflows. +The :meth:`transform` method is ideal for performing element-wise transformations while preserving the shape of the original DataFrame. +It is generally faster than apply because it can take advantage of pandas' internal optimizations. -When to use: Use :meth:`pipe` when you need to create a pipeline of operations and want to keep the code readable and maintainable. +When to use: When you need to perform element-wise transformations that retain the original structure of the DataFrame. + +.. 
code-block:: python + + from sklearn.linear_model import LinearRegression + + df = pd.DataFrame({ + 'group': ['A', 'A', 'A', 'B', 'B', 'B'], + 'x': [1, 2, 3, 1, 2, 3], + 'y': [2, 4, 6, 1, 2, 1.5] + }).set_index("x") + + # Function to fit a model to each group + def fit_model(group): + x = group.index.to_frame() + y = group + model = LinearRegression() + model.fit(x, y) + pred = model.predict(x) + return pred + + result = df.groupby('group').transform(fit_model) Performance From e148685296a2111827d5f8e058a055655d6501d4 Mon Sep 17 00:00:00 2001 From: Marc Garcia Date: Wed, 21 May 2025 18:46:23 +0200 Subject: [PATCH 2/4] Adding examples to all methods --- .../user_guide/user_defined_functions.rst | 204 ++++++++++-------- 1 file changed, 120 insertions(+), 84 deletions(-) diff --git a/doc/source/user_guide/user_defined_functions.rst b/doc/source/user_guide/user_defined_functions.rst index ffa6ac6e8aa47..b6c7f2fabbff2 100644 --- a/doc/source/user_guide/user_defined_functions.rst +++ b/doc/source/user_guide/user_defined_functions.rst @@ -26,20 +26,6 @@ Here’s a simple example to illustrate a UDF applied to a Series: # Apply the function element-wise using .map s.map(add_one) -You can also apply UDFs to an entire DataFrame. For example: - -.. 
ipython:: python - - df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]}) - - # UDF that takes a row and returns the sum of columns A and B - def sum_row(row): - return row["A"] + row["B"] - - # Apply the function row-wise (axis=1 means apply across columns per row) - df.apply(sum_row, axis=1) - - Why Not To Use User-Defined Functions ------------------------------------- @@ -87,25 +73,25 @@ Methods that support User-Defined Functions User-Defined Functions can be applied across various pandas methods: -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| Method | Function Input | Function Output | Description | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| Method | Function Input | Function Output | Description | +============================+========================+==========================+==============================================================================================================================================+ -| :meth:`map` | Scalar | Scalar | Apply a function to each element | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`apply` (axis=0) | Column (Series) | Column (Series) | Apply a function to each column | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`apply` (axis=1) | Row (Series) | Row (Series) | 
Apply a function to each row | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`pipe` | Series or DataFrame | Series or DataFrame | Chain functions together to apply to Series or Dataframe | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`filter` | Series or DataFrame | Boolean | Only accepts UDFs in group by. Function is called for each group, and the group is removed from the result if the function returns ``False`` | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`agg` | Series or DataFrame | Scalar or Series | Aggregate and summarizes values, e.g., sum or custom reducer | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`transform` (axis=0) | Column (Series) | Column (Series) | Same as :meth:`apply` with (axis=0), but it raises an exception if the function changes the shape of the data | -+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ -| :meth:`transform` (axis=1) | Row (Series) | Row (Series) | Same as :meth:`apply` with (axis=1), but it raises an exception if the function changes the shape of the data | 
-+----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.map` | Scalar | Scalar | Apply a function to each element | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.apply` (axis=0) | Column (Series) | Column (Series) | Apply a function to each column | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.apply` (axis=1) | Row (Series) | Row (Series) | Apply a function to each row | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.pipe` | Series or DataFrame | Series or DataFrame | Chain functions together to apply to Series or Dataframe | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.filter` | Series or DataFrame | Boolean | Only accepts UDFs in group by. 
Function is called for each group, and the group is removed from the result if the function returns ``False`` | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.agg` | Series or DataFrame | Scalar or Series | Aggregate and summarizes values, e.g., sum or custom reducer | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.transform` (axis=0) | Column (Series) | Column (Series) | Same as :meth:`apply` with (axis=0), but it raises an exception if the function changes the shape of the data | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ +| :ref:`udf.transform` (axis=1) | Row (Series) | Row (Series) | Same as :meth:`apply` with (axis=1), but it raises an exception if the function changes the shape of the data | ++-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ When applying UDFs in pandas, it is essential to select the appropriate method based on your specific task. Each method has its strengths and is designed for different use @@ -118,6 +104,8 @@ decisions, ensuring more efficient and maintainable code. and :ref:`ewm()` for details. +.. _udf.map: + :meth:`Series.map` and :meth:`DataFrame.map` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -147,6 +135,8 @@ working with medium or large data. 
When to use: Use :meth:`map` for applying element-wise UDFs to DataFrames or Series. +.. _udf.apply: + :meth:`Series.apply` and :meth:`DataFrame.apply` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -213,84 +203,130 @@ about ``apply`` in groupby operations :ref:`groupby.apply`. When to use: :meth:`apply` is suitable when no alternative vectorized method or UDF method is available, but consider optimizing performance with vectorized operations wherever possible. -:meth:`DataFrame.pipe` -~~~~~~~~~~~~~~~~~~~~~~ +.. _udf.pipe: -The :meth:`pipe` method is useful for chaining operations together into a clean and readable pipeline. -It is a helpful tool for organizing complex data processing workflows. +:meth:`Series.pipe` and :meth:`DataFrame.pipe` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When to use: Use :meth:`pipe` when you need to create a pipeline of operations and want to keep the code readable and maintainable. +The ``pipe`` method is similar to ``map`` and ``apply``, but the function receives the whole ``Series`` +or ``DataFrame`` it is called on. + +.. ipython:: python + + temperature = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], + }) -:meth:`DataFrame.filter` -~~~~~~~~~~~~~~~~~~~~~~~~ + def normalize(df): + return df / df.mean().mean() -The :meth:`filter` method is used to select subsets of the DataFrame’s -columns or row. It is useful when you want to extract specific columns or rows that -match particular conditions. + temperature.pipe(normalize) -When to use: Use :meth:`filter` when you want to use a UDF to create a subset of a DataFrame or Series +This is equivalent to calling the ``normalize`` function with the ``DataFrame`` as the parameter. -.. note:: - :meth:`DataFrame.filter` does not accept UDFs, but can accept - list comprehensions that have UDFs applied to them. +.. ipython:: python + + normalize(temperature) + +The main advantage of using ``pipe`` is readability. 
It allows method chaining and clearer code when +calling multiple functions. .. ipython:: python - # Sample DataFrame - df = pd.DataFrame({ - 'AA': [1, 2, 3], - 'BB': [4, 5, 6], - 'C': [7, 8, 9], - 'D': [10, 11, 12] + temperature_celsius = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], }) - # Function that filters out columns where the name is longer than 1 character - def is_long_name(column_name): - return len(column_name) > 1 + def multiply_by_9(value): + return value * 9 - df_filtered = df.filter(items=[col for col in df.columns if is_long_name(col)]) - print(df_filtered) + def divide_by_5(value): + return value / 5 -Since filter does not directly accept a UDF, you have to apply the UDF indirectly, -for example, by using list comprehensions. + def add_32(value): + return value + 32 -:meth:`DataFrame.agg` -~~~~~~~~~~~~~~~~~~~~~ + # Without `pipe`: + fahrenheit = add_32(divide_by_5(multiply_by_9(temperature_celsius))) + + # With `pipe`: + fahrenheit = (temperature_celsius.pipe(multiply_by_9) + .pipe(divide_by_5) + .pipe(add_32)) + +``pipe`` is also available for :meth:`SeriesGroupBy.pipe`, :meth:`DataFrameGroupBy.pipe` and +:meth:`Resampler.pipe`. You can read more about ``pipe`` in groupby operations in :ref:`groupby.pipe`. + +When to use: Use :meth:`pipe` when you need to create a pipeline of operations and want to keep the code readable and maintainable. + +.. _udf.filter: + +:meth:`Series.filter` and :meth:`DataFrame.filter` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``filter`` method is used to select a subset of rows that match certain criteria. +:meth:`Series.filter` and :meth:`DataFrame.filter` do not support user defined functions, +but :meth:`SeriesGroupBy.filter` and :meth:`DataFrameGroupBy.filter` do. You can read more +about ``filter`` in groupby operations in :ref:`groupby.filter`. + +.. 
_udf.agg: + +:meth:`Series.agg` and :meth:`DataFrame.agg` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``agg`` method is used to aggregate a set of data points into a single one. +The most common aggregation functions such as ``min``, ``max``, ``mean``, ``sum``, etc. +are already implemented in pandas. ``agg`` allows you to implement other custom aggregate +functions. + +.. ipython:: python + + temperature = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31], + }) + + def highest_jump(column): + return column.pct_change().max() + + temperature.apply(highest_jump) -If you need to aggregate data, :meth:`agg` is a better choice than apply because it is -specifically designed for aggregation operations. When to use: Use :meth:`agg` for performing custom aggregations, where the operation returns a scalar value on each input. -:meth:`DataFrame.transform` -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.. _udf.transform: -The :meth:`transform` method is ideal for performing element-wise transformations while preserving the shape of the original DataFrame. -It is generally faster than apply because it can take advantage of pandas' internal optimizations. +:meth:`Series.transform` and :meth:`DataFrame.transform` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When to use: When you need to perform element-wise transformations that retain the original structure of the DataFrame. +The ``transform`` method is similar to an aggregation, with the difference that the result is broadcast +to the original data. -.. code-block:: python +.. 
ipython:: python - from sklearn.linear_model import LinearRegression + temperature = pd.DataFrame({ + "NYC": [14, 21, 23], + "Los Angeles": [22, 28, 31]}, + index=pd.date_range("2000-01-01", "2000-01-03")) - df = pd.DataFrame({ - 'group': ['A', 'A', 'A', 'B', 'B', 'B'], - 'x': [1, 2, 3, 1, 2, 3], - 'y': [2, 4, 6, 1, 2, 1.5] - }).set_index("x") + def warm_up_all_days(column): + return pd.Series(column.max(), index=column.index) - # Function to fit a model to each group - def fit_model(group): - x = group.index.to_frame() - y = group - model = LinearRegression() - model.fit(x, y) - pred = model.predict(x) - return pred + temperature.transform(warm_up_all_days) + +In the example, the ``warm_up_all_days`` function computes the ``max`` like an aggregation, but instead +of returning just the maximum value, it returns a ``DataFrame`` with the same shape as the original one +with the values of each day replaced by the maximum temperature of the city. + +``transform`` is also available for :meth:`SeriesGroupBy.transform`, :meth:`DataFrameGroupBy.transform` and +:meth:`Resampler.transform`, where it's more common. You can read more about ``transform`` in groupby +operations in :ref:`groupby.transform`. - result = df.groupby('group').transform(fit_model) +When to use: When you need to perform an aggregation that will be returned in the original structure of +the DataFrame. 
Performance From 32cd67dff13624762ccf3d92dd74f4b8cdbfa195 Mon Sep 17 00:00:00 2001 From: Marc Garcia Date: Wed, 21 May 2025 19:05:53 +0200 Subject: [PATCH 3/4] Fix table --- doc/source/user_guide/user_defined_functions.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/source/user_guide/user_defined_functions.rst b/doc/source/user_guide/user_defined_functions.rst index b6c7f2fabbff2..03e46d4e7fa1a 100644 --- a/doc/source/user_guide/user_defined_functions.rst +++ b/doc/source/user_guide/user_defined_functions.rst @@ -75,7 +75,7 @@ User-Defined Functions can be applied across various pandas methods: +-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ | Method | Function Input | Function Output | Description | -+============================+========================+==========================+==============================================================================================================================================+ ++===============================+========================+==========================+==============================================================================================================================================+ | :ref:`udf.map` | Scalar | Scalar | Apply a function to each element | +-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+ | :ref:`udf.apply` (axis=0) | Column (Series) | Column (Series) | Apply a function to each column | From c13b8198629bd1208805469c11bf7e575c54021f Mon Sep 17 00:00:00 2001 From: Marc Garcia Date: Thu, 22 May 2025 18:13:44 +0200 Subject: [PATCH 4/4] Update label and change typo in example --- 
doc/source/user_guide/user_defined_functions.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/source/user_guide/user_defined_functions.rst b/doc/source/user_guide/user_defined_functions.rst index 03e46d4e7fa1a..f24a71dd690f3 100644 --- a/doc/source/user_guide/user_defined_functions.rst +++ b/doc/source/user_guide/user_defined_functions.rst @@ -1,4 +1,4 @@ -.. _user_defined_functions: +.. _udf: {{ header }} @@ -291,7 +291,7 @@ functions. def highest_jump(column): return column.pct_change().max() - temperature.apply(highest_jump) + temperature.agg(highest_jump) When to use: Use :meth:`agg` for performing custom aggregations, where the operation returns