Pandas 2.x Migration Guide for Energyworx
This guide helps you migrate your custom Flow rules and Market Adapters from Pandas 1.x to Pandas 2.x. It covers every breaking change relevant to Energyworx code, with before/after examples and search patterns to find affected code.
The Energyworx platform is transitioning from Pandas 1.x to 2.x. During this period, your code must work with both Pandas 1.x and 2.x. This guide marks each change with its compatibility:
- Safe now: The new syntax works in both Pandas 1.x and 2.x (e.g., "h" instead of "H", pd.concat() instead of .append()).
- 2.2+ only: The new syntax only works in Pandas 2.2 or later (e.g., "ME" instead of "M"). Do not use these yet — keep the old syntax until the platform fully migrates.
Table of Contents
- Quick Reference: What Changed
- Removed Methods and Parameters
- Frequency Alias Changes
- Timezone and Datetime Changes
- DataFrame Operation Changes
- Timedelta Handling
- Migrating Flow Rules
- Migrating Market Adapters
- Error Message Reference
- Search Patterns for Your Code
- Testing Your Migration
1. Quick Reference: What Changed
| Category | Severity | Compat | Summary |
|---|---|---|---|
| DataFrame.append() removed | Error | Safe now | Use pd.concat() instead |
| pd.date_range(closed=) removed | Error | Safe now | Use inclusive= parameter |
| Frequency aliases ("H", "T", "S") | Warning | Safe now | Lowercase required: "h", "min", "s" |
| Frequency aliases ("M", "Q", "Y") | Warning | 2.2+ only | New suffixes "ME", "QE", "YE" — keep old syntax for now |
| pd.Timedelta("100y") / pd.Timedelta("6M") | Error | Safe now | Use days or DateOffset |
| Index.get_loc(method=) removed | Error | Safe now | Use get_indexer() instead |
| .ix[] accessor removed | Error | Safe now | Use .loc[] or .iloc[] |
| ExcelWriter.save() removed | Error | Safe now | Use .close() |
| is_monotonic removed | Error | Safe now | Use is_monotonic_increasing |
| infer_datetime_format parameter removed | Error | Safe now | Remove the parameter |
| pd.np removed (e.g., pd.np.nan) | Error | Safe now | Use np.nan with import numpy as np |
| pd.to_datetime() stricter format inference | Silent | See notes | Use fallback chain for cross-version compat |
| value_counts().reset_index() columns renamed | Silent | Safe now | Use positional column access |
| groupby([col]) key type changed | Silent | Safe now | Remove list wrapper for single column |
| .columns & list deprecated | Warning | Safe now | Use .intersection() |
| Mixed-type DataFrame operations stricter | Error | Safe now | Select numeric columns first |
| Timezone-naive/aware mixing | Error | Safe now | Always localize timestamps |
| inplace=True deprecated | Warning | Safe now | Use assignment instead |
| Copy-on-Write behavior | Silent | Safe now | Avoid chained indexing |
2. Removed Methods and Parameters
2.1 DataFrame.append() and Series.append() Removed
The .append() method has been removed from DataFrame and Series objects (Index.append() still exists; see Section 8.3). Use pd.concat() instead.
# BEFORE (Pandas 1.x)
df = df.append(other_df)
df = df.append(other_df, ignore_index=True)
series = series.append(other_series)
# AFTER (Pandas 2.x)
df = pd.concat([df, other_df])
df = pd.concat([df, other_df], ignore_index=True)
series = pd.concat([series, other_series])
Notes:
- pd.concat() returns a new object — it does not modify in place.
- Always wrap the objects in a list: [df1, df2].
- For appending a single row as a dict, use pd.concat([df, pd.DataFrame([row_dict])]).
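A minimal row-append sketch based on the third note (the column names here are made up):
import pandas as pd

df = pd.DataFrame({"value": [1.0], "quality": ["ok"]})
row_dict = {"value": 2.0, "quality": "estimated"}

# Wrap the dict in a one-row DataFrame, then concat (replaces df.append(row_dict))
df = pd.concat([df, pd.DataFrame([row_dict])], ignore_index=True)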
Search pattern: \.append\( (then verify it's on a DataFrame/Series, not a Python list)
2.2 pd.date_range(closed=) Removed
The closed parameter in pd.date_range() has been replaced with inclusive.
This change only applies to pd.date_range(). Other methods like IntervalIndex.from_breaks(), IntervalIndex.from_arrays(), and pd.cut() still use the closed parameter in Pandas 2.x.
# BEFORE (Pandas 1.x)
pd.date_range(start, end, freq="h", closed="right")
pd.date_range(start, end, freq="h", closed="left")
pd.date_range(start, end, freq="h", closed=None)
# AFTER (Pandas 2.x)
pd.date_range(start, end, freq="h", inclusive="right")
pd.date_range(start, end, freq="h", inclusive="left")
pd.date_range(start, end, freq="h", inclusive="both")
Mapping:
| Old (closed=) | New (inclusive=) | Meaning |
|---|---|---|
| closed=None | inclusive="both" | Include both start and end |
| closed="left" | inclusive="left" | Include start, exclude end |
| closed="right" | inclusive="right" | Exclude start, include end |
Search pattern: date_range\([^)]*closed\s*=
2.3 Index.get_loc(method=) Removed
The method parameter has been removed from Index.get_loc(). Use Index.get_indexer() instead.
# BEFORE (Pandas 1.x)
idx = df.index.get_loc(date, method="nearest")
# AFTER (Pandas 2.x)
idx = df.index.get_indexer([date], method="nearest")[0]
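One behavioral difference to be aware of: get_loc() raised KeyError on a failed lookup, while get_indexer() returns -1 for unmatched labels (for example, when a tolerance is set). A defensive sketch, with the tolerance value purely illustrative:
# Guard against -1 before using the position
pos = df.index.get_indexer([date], method="nearest", tolerance=pd.Timedelta("1h"))[0]
if pos == -1:
    raise KeyError(f"no index entry within tolerance of {date}")
row = df.iloc[pos]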
Search pattern: \.get_loc\([^)]*method\s*=
2.4 .ix[] Accessor Removed
The .ix[] accessor was removed. Use .loc[] (label-based) or .iloc[] (position-based) instead.
# BEFORE (Pandas 1.x)
value = df.ix[row_label]
value = df.ix[0]
# AFTER (Pandas 2.x)
value = df.loc[row_label] # by label
value = df.iloc[0] # by position
Search pattern: \.ix\[
2.5 ExcelWriter.save() Removed
# BEFORE (Pandas 1.x)
writer = pd.ExcelWriter(output, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
# AFTER (Pandas 2.x)
writer = pd.ExcelWriter(output, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.close()
Search pattern: writer\.save\(\)
2.6 is_monotonic Removed
# BEFORE (Pandas 1.x)
df.index.is_monotonic
# AFTER (Pandas 2.x)
df.index.is_monotonic_increasing
Search pattern: \.is_monotonic(?!_)
2.7 infer_datetime_format Parameter Removed
The infer_datetime_format parameter has been removed from pd.to_datetime() and pd.read_csv(). Pandas 2.x infers a format from the first non-null value automatically, so simply remove the parameter (see Section 4.4 if your data mixes formats).
# BEFORE (Pandas 1.x)
pd.to_datetime(series, infer_datetime_format=True)
pd.read_csv(file, parse_dates=True, infer_datetime_format=True)
# AFTER (Pandas 2.x)
pd.to_datetime(series)
pd.read_csv(file, parse_dates=True)
Search pattern: infer_datetime_format
2.8 pd.np Removed
The pd.np alias for numpy has been removed. Use import numpy as np and reference np directly.
# BEFORE (Pandas 1.x)
import pandas as pd
df.replace({pd.np.nan: ''})
value = pd.np.nan
# AFTER (Pandas 2.x)
import pandas as pd
import numpy as np
df.replace({np.nan: ''})
value = np.nan
Notes:
- This is very common in Market Adapters that use pd.np.nan as a sentinel value.
- The fix is straightforward: add import numpy as np and replace pd.np with np.
Search pattern: pd\.np\.
3. Frequency Alias Changes
Pandas 2.2 deprecated many frequency alias strings. These currently raise FutureWarning but will become errors in a future version.
3.1 Safe to Change Now (works in both Pandas 1.x and 2.x)
These lowercase aliases are accepted by both Pandas 1.x and 2.x. Change these now:
| Old Alias | New Alias | Meaning | Affects |
|---|---|---|---|
"H" | "h" | Hour | resample(), date_range(), Grouper(), Timedelta() |
"T" | "min" | Minute | Same |
"S" | "s" | Second | Same |
"L" | "ms" | Millisecond | Same |
"U" | "us" | Microsecond | Same |
"N" | "ns" | Nanosecond | Same |
3.2 Do NOT Change Yet (only valid in Pandas 2.2+)
These new aliases ("ME", "QE", "YE", etc.) are not recognized by Pandas versions before 2.2 and will raise ValueError: Invalid frequency. Since the platform must support both Pandas 1.x and 2.x, keep the old aliases for now. They emit a FutureWarning in 2.2+ but still work.
| Old Alias | Future Alias | Meaning | Action |
|---|---|---|---|
"M" | "ME" | Month End | Keep "M" for now |
"Q" | "QE" | Quarter End | Keep "Q" for now |
"Y" or "A" | "YE" | Year End | Keep "Y" for now |
"BM" | "BME" | Business Month End | Keep "BM" for now |
"BQ" | "BQE" | Business Quarter End | Keep "BQ" for now |
"BA" | "BYE" | Business Year End | Keep "BA" for now |
"AS" | "YS" | Year Start | Keep "AS" for now |
"BAS" | "BYS" | Business Year Start | Keep "BAS" for now |
Aliases that are still valid (no change needed): "D" (day), "W" (week), "MS" (month start), "QS" (quarter start), "B" (business day).
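If the FutureWarning noise from the old aliases becomes a problem before the platform fully migrates, one option is to pick the alias at runtime. This is a sketch, not a platform convention; the MONTH_END name is ours:
import pandas as pd

# Pandas 2.2+ understands "ME"; older versions need "M"
_major, _minor = (int(part) for part in pd.__version__.split(".")[:2])
MONTH_END = "ME" if (_major, _minor) >= (2, 2) else "M"

monthly = df.resample(MONTH_END).sum()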
3.3 Common Energyworx Examples
# BEFORE (Pandas 1.x)
df.resample("H").sum()
df.resample("15T").mean()
pd.date_range(start, end, freq="1H")
pd.Grouper(freq="1H")
# AFTER (safe for both Pandas 1.x and 2.x)
df.resample("h").sum()
df.resample("15min").mean()
pd.date_range(start, end, freq="1h")
pd.Grouper(freq="1h")
# NOTE: Keep "M" for now — "ME" only works in Pandas 2.2+
pd.Grouper(freq="M") # keep as-is (will emit FutureWarning in 2.2+)
df.resample("M").sum() # keep as-is
Compound frequencies: When a number precedes the alias, update only the letter part:
"1H"→"1h""15T"→"15min""30S"→"30s""100L"→"100ms"
Search patterns:
- Hour: freq\s*=\s*["'][^"']*H["'] or resample\(\s*["'][^"']*H["']
- Minute: freq\s*=\s*["'][^"']*T["']
- Second: freq\s*=\s*["'][^"']*[0-9]S["'] (careful: "MS" is valid)
- Month end: freq\s*=\s*["']M["'] (exactly "M", not "MS" or "ME")
- Year end: freq\s*=\s*["'][^"']*[AY]["']
4. Timezone and Datetime Changes
4.1 Cannot Mix Timezone-Naive and Timezone-Aware
Pandas 2.x strictly rejects operations that mix timezone-naive and timezone-aware datetime objects. This is especially important in Energyworx because self.flow_timestamp is timezone-naive (even though it represents UTC).
# BEFORE (Pandas 1.x) — worked but was technically incorrect
start = pd.Timestamp("2024-01-01") # naive
df = self.dataframe # has UTC DatetimeIndex
result = df.loc[start:] # worked implicitly
# AFTER (Pandas 2.x) — must match timezone
start = pd.Timestamp("2024-01-01", tz="UTC")
df = self.dataframe
result = df.loc[start:]
Common fix for self.flow_timestamp:
# BEFORE
timestamp = pd.Timestamp(self.flow_timestamp)
# AFTER
timestamp = pd.Timestamp(self.flow_timestamp, tz="UTC")
Common fix for computed timestamps:
# BEFORE
edit_date = pd.Timestamp(start_date) + pd.Timedelta(hours=1)
# AFTER — localize AFTER arithmetic, or localize the input
edit_date = (pd.Timestamp(start_date) + pd.Timedelta(hours=1)).tz_localize("UTC")
# OR
edit_date = pd.Timestamp(start_date, tz="UTC") + pd.Timedelta(hours=1)
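A small helper that makes this pattern hard to get wrong (the name ensure_utc is ours, not a platform API): localize if naive, convert if aware:
import pandas as pd

def ensure_utc(value):
    """Return a tz-aware UTC Timestamp whether the input is naive or aware."""
    ts = pd.Timestamp(value)
    return ts.tz_localize("UTC") if ts.tz is None else ts.tz_convert("UTC")

ensure_utc("2024-01-01")                                    # naive input: localized
ensure_utc(pd.Timestamp("2024-01-01", tz="Europe/Paris"))   # aware input: converted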
Search pattern: pd\.Timestamp\((?![^)]*tz\s*=) — matches pd.Timestamp( calls with no tz= argument; then check whether the result is used with tz-aware data.
4.2 Using .date() on Timezone-Aware Index
Calling .date() on a timezone-aware Timestamp returns a timezone-naive datetime.date, which cannot be used for slicing a timezone-aware index. Use .normalize() instead.
# BEFORE (Pandas 1.x)
end = df.index[-1].date()
result = df.loc[:end, columns]
# AFTER (Pandas 2.x) — .normalize() gives midnight in the same timezone
end = df.index[-1].normalize()
result = df.loc[:end, columns]
Search pattern: \.index\[.*\]\.date\(\)
4.3 Timezone Comparisons
Pandas 2.x may represent UTC using different timezone objects internally. Direct comparison with pytz.UTC or datetime.timezone.utc can fail.
# BEFORE (Pandas 1.x)
import datetime as dt
assert df.index.tz == dt.timezone.utc
# AFTER (Pandas 2.x) — flexible check
assert str(df.index.tz) in ("UTC", "UTC+00:00") or df.index.tz == dt.timezone.utc
Search pattern: \.tz\s*==
4.4 pd.to_datetime() Format Inference
Pandas 2.x no longer guesses the format when a column contains mixed date formats. If your data has inconsistent formats, you must handle this explicitly.
The format="mixed" and format="ISO8601" parameters are only available in Pandas 2.0+. If your code must run on both Pandas 1.x and 2.x, use the fallback pattern below.
Cross-version fallback pattern (recommended):
# Safe for both Pandas 1.x and 2.x
def parse_dates(series, dateformat=None, utc=False):
"""Parse dates with graduated fallback for cross-version compatibility."""
try:
return pd.to_datetime(series, format=dateformat, utc=utc)
except ValueError:
# Pandas 2.x enforces strict format matching. Try fallback strategies
# to handle minor format variations (e.g. ISO 'T' separator vs space).
fallbacks = [None, "mixed"] if dateformat else ["mixed"]
for fmt in fallbacks:
try:
return pd.to_datetime(series, format=fmt, utc=utc)
except (ValueError, TypeError):
continue
raise
This pattern works because:
- On Pandas 1.x, the initial call with format=dateformat usually succeeds (lenient matching), and format=None also works as a fallback.
- On Pandas 2.x, if strict matching fails, format=None (auto-infer) is tried first, then format="mixed" as a last resort.
- The format="mixed" call is only reached on Pandas 2.x where it's available.
If you only need Pandas 2.x support:
# Pandas 2.x only
dates = pd.to_datetime(series, format="mixed")
# OR for ISO 8601 strings
dates = pd.to_datetime(series, format="ISO8601")
# OR specify exact format
dates = pd.to_datetime(series, format="%Y-%m-%d %H:%M:%S")
When to use which:
- format="mixed": Data contains multiple different formats (e.g., some rows "2024-01-01", others "01/01/2024")
- format="ISO8601": All dates are ISO 8601 but with varying precision (e.g., some with seconds, some without)
- Explicit format string: All dates follow the same format
Search pattern: pd\.to_datetime\((?![^)]*format\s*=) — matches calls with no format= argument; check if the input data could have mixed formats.
4.5 datetime64 Resolution Changes
Pandas 2.x supports multiple datetime resolutions (datetime64[s], [ms], [us], [ns]) instead of only nanoseconds. This can cause issues when combining data with different resolutions.
# If you encounter resolution mismatch errors:
df.index = df.index.as_unit("ns") # convert to nanoseconds
# OR
series = series.dt.as_unit("ns")
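A minimal illustration of how two indexes can end up with different units (Pandas 2.x only, since as_unit() does not exist in 1.x):
import pandas as pd

idx_us = pd.to_datetime(["2024-01-01"]).as_unit("us")   # microsecond resolution
idx_ns = pd.to_datetime(["2024-01-01"])                 # default nanosecond resolution

print(idx_us.dtype, idx_ns.dtype)    # datetime64[us] datetime64[ns]
idx_us = idx_us.as_unit("ns")        # align units before combining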
5. DataFrame Operation Changes
5.1 Year-String Indexing
In Pandas 2.x, df["2024"] looks for a column named "2024" rather than filtering a DatetimeIndex by year.
# BEFORE (Pandas 1.x)
result = df["2024"]
result = df["2024"]["column_name"]
# AFTER (Pandas 2.x) — use .loc[]
result = df.loc["2024"]
result = df.loc["2024", "column_name"]
Search pattern: df\["[0-9]{4}"\]
5.2 Mixed-Type DataFrame Operations
Operations like .sum(), comparisons (<, >), and .clip() now raise errors when the DataFrame contains non-numeric columns (e.g., datetime or string columns).
# BEFORE (Pandas 1.x) — silently skipped non-numeric columns
total = df.sum()
negative_mask = df < 0
# AFTER (Pandas 2.x) — select numeric columns first
total = df.sum(numeric_only=True)
# OR
total = df[column_name].sum()
numeric_cols = df.select_dtypes(include=["number"]).columns
negative_mask = df[numeric_cols] < 0
For .clip():
# BEFORE (Pandas 1.x)
df = df.clip(lower=0)
# AFTER (Pandas 2.x)
numeric_cols = df.select_dtypes(include=["number"]).columns
df[numeric_cols] = df[numeric_cols].clip(lower=0)
Search pattern: \.sum\(\), \.clip\(, df\s*[<>] — check if the DataFrame could contain non-numeric columns.
5.3 DataFrame.columns & list Deprecated
Using the & operator between an Index and a list is deprecated. Use .intersection() instead.
# BEFORE (Pandas 1.x)
columns = df.columns & ["col1", "col2", "col3"]
# AFTER (Pandas 2.x)
columns = df.columns.intersection(["col1", "col2", "col3"])
Search pattern: \.columns\s*&\s*\[
5.4 groupby([single_column]) Key Type Changed
When grouping by a single column wrapped in a list, Pandas 2.x returns tuple keys (e.g., ("A",)) instead of scalar keys (e.g., "A"). Remove the list wrapper for single-column groupby.
# BEFORE (Pandas 1.x) — key is "A" (scalar)
for key, group in df.groupby([column_name]):
print(key) # "A"
# AFTER (Pandas 2.x) — remove list wrapper to get scalar keys
for key, group in df.groupby(column_name):
print(key) # "A"
Important: Only change this for single column groupby. Multi-column groupby should keep the list:
# Multi-column — keep the list
for key, group in df.groupby([col1, col2]):
print(key) # ("A", "B") — tuple in both versions
Search pattern: \.groupby\(\[ — check if only one column is inside the brackets.
5.5 value_counts().reset_index() Column Names Changed
The output column names from .value_counts().reset_index() have changed.
# Pandas 1.x result columns: ["index", "column_name"]
# Pandas 2.x result columns: ["column_name", "count"]
# BEFORE (Pandas 1.x)
counts = df["status"].value_counts().reset_index()
value = counts["index"][0]
count = counts["status"][0]
# AFTER (Pandas 2.x) — use positional access for version-agnostic code
counts = df["status"].value_counts().reset_index()
value_col = counts.columns[0] # the original values
count_col = counts.columns[1] # the counts
value = counts[value_col][0]
count = counts[count_col][0]
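An alternative that also works on both versions is to overwrite the column names outright, so downstream code can rely on fixed names (the names "value" and "count" are our choice here):
counts = df["status"].value_counts().reset_index()
counts.columns = ["value", "count"]   # identical names under Pandas 1.x and 2.x
value = counts["value"][0]
count = counts["count"][0]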
Search pattern: \.value_counts\(\)\.reset_index\(\)
5.6 Series Assignment to Filtered DataFrames
Assigning a Series to filtered DataFrame rows can fail when the index has duplicate labels. Use .values to convert to a numpy array first.
# BEFORE (Pandas 1.x)
df.loc[mask, "column"] = some_series
# AFTER (Pandas 2.x) — use .values to bypass reindexing
df.loc[mask, "column"] = some_series[mask].values
Search pattern: \.loc\[[^\]]*,[^\]]*\]\s*= — then check for .loc[mask, col] = series assignments where the right-hand side lacks .values.
5.7 Copy-on-Write and Chained Indexing
Pandas 2.x introduces Copy-on-Write, which becomes the default in pandas 3.0 and can be enabled in 2.x via pd.options.mode.copy_on_write = True. Under Copy-on-Write, chained indexing (getting a value through two successive [] operations) no longer modifies the original DataFrame; even without it, chained assignment was never reliable.
# BEFORE (Pandas 1.x) — modified df in place
df["column"][mask] = new_value
# AFTER (Pandas 2.x) — use .loc[] for direct modification
df.loc[mask, "column"] = new_value
Search pattern: df\[["'][^"']+["']\]\[ — look for df["col"][...].
5.8 inplace=True Deprecated
The inplace parameter is deprecated on most DataFrame/Series methods. Use assignment instead.
# BEFORE (Pandas 1.x)
df.reset_index(inplace=True)
df.sort_values("col", inplace=True)
df.drop(columns=["col"], inplace=True)
df.fillna(0, inplace=True)
# AFTER (Pandas 2.x)
df = df.reset_index()
df = df.sort_values("col")
df = df.drop(columns=["col"])
df = df.fillna(0)
Search pattern: inplace\s*=\s*True
6. Timedelta Handling
Timedelta operations have several compatibility pitfalls. This section covers patterns that are safe across Pandas versions.
6.1 pd.Timedelta with Year/Month Units
Year ("Y", "y") and month ("M") units are no longer accepted because they're ambiguous (a year can be 365 or 366 days; a month can be 28–31 days).
# BEFORE (Pandas 1.x)
pd.Timedelta("100y")
pd.Timedelta("6M")
# AFTER (Pandas 2.x) — use explicit days
pd.Timedelta(days=36500) # ~100 years
pd.Timedelta(days=180) # ~6 months
# OR use DateOffset for calendar-aware offsets
pd.DateOffset(years=100)
pd.DateOffset(months=6)
Note: pd.DateOffset respects calendar months/years (e.g., adding 1 month to Jan 31 gives Feb 28), while pd.Timedelta(days=30) always adds exactly 30 days. Use whichever is correct for your business logic.
Search pattern: pd\.Timedelta\(\s*["'][0-9]+\s*[yYM]["']\s*\) (lowercase "m" means minutes and is still valid, so it is excluded)
6.2 Converting Timedelta to Seconds or Days
The .dt.total_seconds() accessor and division by pd.Timedelta() can behave differently across versions. The safest cross-version approach uses numpy:
import numpy as np
# BEFORE (Pandas 1.x) — may fail or give wrong results in 2.x
seconds = timedelta_series.dt.total_seconds()
seconds = timedelta_series / pd.Timedelta(seconds=1)
days = timedelta_series.dt.days
# AFTER (safe across versions) — use numpy timedelta64
seconds = pd.Series(
timedelta_series.values / np.timedelta64(1, 's'),
index=timedelta_series.index
)
days = pd.Series(
timedelta_series.values / np.timedelta64(1, 'D'),
index=timedelta_series.index
).astype(int)
Key rule: Always use .values to get the underlying numpy array before dividing by np.timedelta64().
Search pattern: \.dt\.total_seconds\(\), \.dt\.days, /\s*pd\.Timedelta\(
6.3 Timedelta .astype(int) Returns Nanoseconds
When you call .astype(int) on a timedelta column, it converts to the internal representation (nanoseconds), not seconds. This can silently produce values that are 1,000,000,000x larger than expected.
# BEFORE (Pandas 1.x) — often appeared to work because of implicit conversions
interval_seconds = timedelta_column.astype(int)
# Danger: returns nanoseconds, not seconds!
# AFTER — convert to seconds explicitly first
interval_seconds = (timedelta_column.values / np.timedelta64(1, 's')).astype(int)
Search pattern: timedelta.*\.astype\(\s*int\s*\), \.astype\(\s*int\s*\) on columns that might contain timedelta values.
6.4 Extracting Values from Timedelta Columns
When you extract a single value from a timedelta column (e.g., df[column][0]), the result type depends on the pandas version and operation history. It may be:
- A float (seconds) — use directly
- A timedelta64 object — divide by np.timedelta64(1, 's')
- An int64 containing nanoseconds — divide by 1,000,000,000
import numpy as np
# Robust extraction pattern
raw_value = df[column].iloc[0]
try:
numeric_value = int(raw_value)
except (TypeError, ValueError):
# It's a timedelta object — convert to seconds
value_in_seconds = int(raw_value / np.timedelta64(1, 's'))
else:
# Check if it's nanoseconds (> ~10 years in seconds)
if numeric_value > 315_360_000:
value_in_seconds = numeric_value // 1_000_000_000
else:
value_in_seconds = numeric_value
7. Migrating Flow Rules
This section walks through the most common patterns found in Energyworx Flow rules and how to update them.
7.1 Timezone Conversion Pattern
This is the most common pattern in rules — converting between UTC and local time:
class MyRule(AbstractRule):
def apply(self, **kwargs):
local_tz = self.datasource.timezone
df = self.dataframe[[self.source_column]].copy()
# Convert to local time for business logic
df = df.tz_convert(local_tz)
# MIGRATION CHECK: If you create timestamps for slicing,
# make sure they are timezone-aware
# BEFORE:
start = pd.Timestamp("2024-01-01")
# AFTER:
start = pd.Timestamp("2024-01-01", tz=local_tz)
# Process...
result = df.loc[start:]
# Convert back to UTC
result = result.tz_convert("UTC")
return RuleResult(result=result)
7.2 Resampling Pattern
Many rules aggregate data using .resample(). Update frequency aliases and check closed parameter usage:
class MyAggregationRule(AbstractRule):
def apply(self, interval="h", **kwargs):
df = self.dataframe[[self.source_column]].copy()
# BEFORE:
resampled = df.resample("H", closed="right", label="right").sum()
# AFTER:
resampled = df.resample("h", closed="right", label="right").sum()
# Note: 'closed' parameter still works on resample() — only
# pd.date_range() replaced it with 'inclusive'.
return RuleResult(result=resampled)
7.3 Date Range Generation Pattern
Rules that generate date ranges (e.g., for gap filling or profile creation):
class MyGapFillRule(AbstractRule):
def apply(self, heartbeat=3600, **kwargs):
start = self.dataframe.index[0]
end = self.dataframe.index[-1]
# BEFORE:
full_range = pd.date_range(start, end, freq="{}s".format(heartbeat), closed="right")
# AFTER:
full_range = pd.date_range(start, end, freq="{}s".format(heartbeat), inclusive="right")
return RuleResult()
7.4 Using self.flow_timestamp
The self.flow_timestamp is always timezone-naive UTC. In Pandas 2.x, you must localize it before using it with timezone-aware data:
class MyRule(AbstractRule):
def apply(self, **kwargs):
# BEFORE — worked with implicit conversion:
flow_ts = pd.Timestamp(self.flow_timestamp)
df = self.dataframe.loc[:flow_ts]
# AFTER — explicit timezone:
flow_ts = pd.Timestamp(self.flow_timestamp, tz="UTC")
df = self.dataframe.loc[:flow_ts]
return RuleResult()
7.5 Concatenating DataFrames in Rules
Rules that combine data from multiple sources:
class MyCombineRule(AbstractRule):
def prepare_context(self, other_datasource_id, **kwargs):
return {
"prepare_datasource_ids": [other_datasource_id],
"other_id": other_datasource_id,
}
def apply(self, **kwargs):
other_ds = self.prepared_datasources[self.context["other_id"]]
other_df = self.load_timeseries(other_ds.id, [self.source_column],
self.dataframe.index[0],
self.dataframe.index[-1])
# BEFORE:
combined = self.dataframe.append(other_df)
# AFTER:
combined = pd.concat([self.dataframe, other_df])
return RuleResult(result=combined)
7.6 Grouper with Frequency
Rules that group by time periods:
class MyMonthlyRule(AbstractRule):
def apply(self, **kwargs):
df = self.dataframe[[self.source_column]].copy()
# BEFORE:
monthly = df.groupby(pd.Grouper(freq="M")).sum()
hourly = df.groupby(pd.Grouper(freq="1H")).mean()
# AFTER:
monthly = df.groupby(pd.Grouper(freq="M")).sum() # keep "M" — "ME" is 2.2+ only
hourly = df.groupby(pd.Grouper(freq="1h")).mean() # "h" is safe to change now
return RuleResult(result=monthly)
7.7 Sum on DataFrames with Multiple Column Types
Rules that sum across all columns when some columns are non-numeric:
class MyValidationRule(AbstractRule):
def apply(self, **kwargs):
df = self.dataframe.copy()
# BEFORE — silently skipped datetime columns:
total = df.sum().values[0]
# AFTER — specify the column or use numeric_only:
total = df[self.source_column].sum()
# OR
total = df.sum(numeric_only=True).values[0]
return RuleResult()
8. Migrating Market Adapters
Market Adapters typically use pandas for parsing files and reshaping data. The most common migration issues are in the split() and adapt() methods.
8.1 CSV Parsing with Date Columns
class MyCSVAdapter(PluggableMarketAdapter):
def adapt(self, content, current_datetime, **kwargs):
df = pd.read_csv(
io.StringIO(content),
# BEFORE:
parse_dates=["date_col"],
infer_datetime_format=True,
# AFTER — remove infer_datetime_format:
parse_dates=["date_col"],
)
# If dates have mixed formats, parse separately:
# df["date_col"] = pd.to_datetime(df["date_col"], format="mixed")
return self.normalize_csv(df.to_csv(index=False))
8.2 Groupby for Splitting by Datasource
class MyCSVAdapter(PluggableMarketAdapter):
def split(self, content, **kwargs):
df = pd.read_csv(io.StringIO(content), dtype=str)
# BEFORE — single column in list:
for datasource_id, group in df.groupby(["meter_id"]):
# datasource_id was a scalar in 1.x, tuple in 2.x
yield group.to_csv(index=False)
# AFTER — remove list wrapper for single column:
for datasource_id, group in df.groupby("meter_id"):
# datasource_id is always a scalar
yield group.to_csv(index=False)
8.3 Horizontal-to-Vertical Format Conversion
Adapters that reshape horizontal (wide) data into vertical (long) format:
class MyHorizontalAdapter(PluggableMarketAdapter):
def adapt(self, content, current_datetime, **kwargs):
df = pd.read_csv(io.StringIO(content), dtype=str)
dates = pd.to_datetime(df["date"])
# Creating time intervals
intervals = pd.timedelta_range(start="0h", periods=24, freq="1h")
# TimedeltaIndex.append() still works in 2.x (only DataFrame and Series
# lost .append()), so no change is required here:
all_intervals = intervals.append(pd.TimedeltaIndex([pd.Timedelta(hours=25)]))
# To avoid .append() entirely, build the index directly instead:
# all_intervals = pd.TimedeltaIndex(
#     list(intervals) + [pd.Timedelta(hours=25)]
# )
return self.normalize_json(result)
8.4 Excel File Handling
class MyExcelAdapter(PluggableMarketAdapter):
def adapt(self, content, current_datetime, **kwargs):
excel_file = pd.ExcelFile(io.BytesIO(content.encode()))
df = excel_file.parse("Sheet1")
# BEFORE — if writing Excel output:
writer = pd.ExcelWriter(output, engine="xlsxwriter")
df.to_excel(writer, sheet_name="Output")
writer.save() # Removed in 2.x
# AFTER:
writer = pd.ExcelWriter(output, engine="xlsxwriter")
df.to_excel(writer, sheet_name="Output")
writer.close() # Use close() instead
return self.normalize_csv(df.to_csv(index=False))
8.5 Replacing pd.np.nan in Adapters
Many adapters use pd.np.nan to replace or detect missing values. This alias was removed in Pandas 2.x.
class MyCSVAdapter(PluggableMarketAdapter):
def split(self, element, **kwargs):
import pandas as pd
# BEFORE:
# group.replace({pd.np.nan: ''}, inplace=True)
# AFTER:
import numpy as np
group = group.replace({np.nan: ''})
yield group.values.tolist()
8.6 Date Range in Adapters
Adapters that create date ranges for timeseries output:
class MyDomainAdapter(PluggableMarketAdapter):
def adapt(self, content, current_datetime, **kwargs):
# BEFORE:
timestamps = pd.date_range(
start=dt.datetime(2024, 1, 1, tzinfo=pytz.UTC),
end=dt.datetime(2024, 1, 1, 23, 0, 0, tzinfo=pytz.UTC),
freq='1H' # deprecated alias
)
# AFTER:
timestamps = pd.date_range(
start=dt.datetime(2024, 1, 1, tzinfo=pytz.UTC),
end=dt.datetime(2024, 1, 1, 23, 0, 0, tzinfo=pytz.UTC),
freq='1h' # lowercase alias
)
df = pd.DataFrame({"channel_1": values}, index=timestamps)
timeseries = self.create_timeseries(df=df, datasource=ds, version=current_datetime)
self.output_timeseries(timeseries)
9. Error Message Reference
When you encounter one of these errors, use the table to find the fix:
| Error Message | Cause | Fix |
|---|---|---|
| 'DataFrame' object has no attribute 'append' | DataFrame.append() removed | Use pd.concat([df1, df2]) — Section 2.1 |
| got an unexpected keyword argument 'closed' | closed parameter removed from date_range() | Use inclusive= — Section 2.2 |
| got an unexpected keyword argument 'method' on get_loc | method removed from get_loc() | Use get_indexer() — Section 2.3 |
| 'DataFrame' object has no attribute 'ix' | .ix[] removed | Use .loc[] or .iloc[] — Section 2.4 |
| 'XlsxWriter' object has no attribute 'save' | save() removed | Use .close() — Section 2.5 |
| 'Index' object has no attribute 'is_monotonic' | is_monotonic removed | Use is_monotonic_increasing — Section 2.6 |
| got an unexpected keyword argument 'infer_datetime_format' | Parameter removed | Remove the parameter — Section 2.7 |
| module 'pandas' has no attribute 'np' | pd.np removed | Use import numpy as np and np.nan — Section 2.8 |
| FutureWarning: 'H' is deprecated and will be removed... | Old frequency alias | Use 'h' — Section 3 |
| FutureWarning: 'M' is deprecated...use 'ME'... | Old frequency alias | Keep 'M' for now — 'ME' is 2.2+ only. See Section 3 |
| Units 'M', 'Y' and 'y' do not represent unambiguous timedelta values | Ambiguous Timedelta unit | Use days: pd.Timedelta(days=N) — Section 6.1 |
| Cannot compare tz-naive and tz-aware datetime-like objects | Mixed timezone awareness | Add tz="UTC" or .tz_localize("UTC") — Section 4.1 |
| cannot reindex on an axis with duplicate labels | Series reindexing conflict | Use .values — Section 5.6 |
| 'DatetimeArray' with dtype datetime64[ns] does not support reduction 'sum' | Summing datetime columns | Use numeric_only=True — Section 5.2 |
| UFuncBinaryResolutionError | Timedelta division incompatibility | Use np.timedelta64() — Section 6.2 |
10. Search Patterns for Your Code
Use these patterns to scan your code for potential migration issues. Each can be used with your IDE's search (regex mode) or with grep -E.
Critical — Will Error
# DataFrame/Series.append()
\.append\(
# pd.date_range with closed=
date_range\([^)]*closed\s*=
# Deprecated frequency aliases — safe to change now
(?:resample|date_range|Grouper|Timedelta)\([^)]*["'][^"']*(?<![a-zA-Z])H(?!z)["']
# Deprecated frequency aliases — DO NOT change yet (2.2+ only)
# These will emit FutureWarning but still work. Keep as-is for cross-version compat.
# freq\s*=\s*["']M["']
# freq\s*=\s*["'][AY]["']
# pd.Timedelta with year/month units
pd\.Timedelta\(\s*["'][0-9]+\s*[yYM]["']
# get_loc with method=
\.get_loc\([^)]*method\s*=
# .ix[] accessor
\.ix\[
# ExcelWriter.save()
\.save\(\)
# is_monotonic (without _increasing/_decreasing)
\.is_monotonic(?!_)
# infer_datetime_format
infer_datetime_format
# pd.np (e.g., pd.np.nan)
pd\.np\.
High Priority — Silent Behavior Changes
# pd.to_datetime without format (check for mixed data)
pd\.to_datetime\((?![^)]*format\s*=)
# value_counts().reset_index()
\.value_counts\(\)\.reset_index\(\)
# groupby with single column in list
\.groupby\(\[[^\],]+\]\)
# .columns & list
\.columns\s*&\s*\[
# .sum() on DataFrames (check for non-numeric columns)
\.sum\(\s*\)
# Timezone-naive Timestamps used with tz-aware data
pd\.Timestamp\((?![^)]*tz\s*=)
# .date() on tz-aware timestamps
\.date\(\)
# Year-string indexing
\[["'][0-9]{4}["']\]
# Chained indexing
df\[["'][^"']+["']\]\[
Medium Priority — Deprecation Warnings
# inplace=True
inplace\s*=\s*True
# Timezone comparisons
\.tz\s*==
# .dt.total_seconds() on timedelta
\.dt\.total_seconds\(\)
# timedelta .astype(int)
\.astype\(\s*int\s*\)
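If you prefer scanning from Python rather than the IDE, a small sketch along these lines works with the patterns above (the rules/ directory and the pattern subset are assumptions; adjust both to your repository):
import pathlib
import re

# A subset of the critical patterns above, keyed by a short label
PATTERNS = {
    "append()": r"\.append\(",
    "closed=": r"date_range\([^)]*closed\s*=",
    ".ix[]": r"\.ix\[",
    "pd.np": r"pd\.np\.",
    "infer_datetime_format": r"infer_datetime_format",
}

for path in pathlib.Path("rules").rglob("*.py"):
    text = path.read_text(encoding="utf-8")
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            line_no = text.count("\n", 0, match.start()) + 1
            print(f"{path}:{line_no}: {label}")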
11. Testing Your Migration
Step-by-Step Testing Approach
1. Search your code using the patterns from Section 10 to identify all affected lines.
2. Apply the fixes from this guide, working through one category at a time.
3. Run your unit tests if you have them. Pay special attention to:
   - Tests that create DataFrames with timezone-aware indices
   - Tests that use timedelta operations
   - Tests that assert on specific column names after value_counts()
4. Test with real data on a non-production environment. Check:
   - Do resample operations produce the same number of output rows?
   - Are timezone conversions producing correct local times?
   - Are numeric aggregations (sum, mean) returning the same values?
   - Are date ranges generating the correct number of timestamps?
Common Verification Checks
# Verify resample output hasn't changed
# Run with both old and new code, compare:
assert old_result.shape == new_result.shape
assert (old_result.values == new_result.values).all()
# Verify timezone handling
assert df.index.tz is not None, "Index should be timezone-aware"
# Verify date_range output
old_range = pd.date_range(start, end, freq="1h", inclusive="both")
assert len(old_range) == expected_count
Warnings to Watch For
After migration, run your code and watch for these FutureWarning messages in the console output — each indicates something that will break in a future pandas version:
- FutureWarning: ... is deprecated and will be removed in a future version — frequency alias needs updating
- FutureWarning: The behavior of DataFrame.sum with axis=None is deprecated — add an explicit axis= argument
- FutureWarning: Downcasting object dtype arrays... — explicit dtype conversion needed
- FutureWarning: Setting an item of incompatible dtype... — check dtype compatibility
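To keep these warnings from hiding in noisy output, one option (a sketch; adapt to your test harness) is to escalate FutureWarning to an error while exercising migrated code paths:
import warnings

with warnings.catch_warnings():
    warnings.simplefilter("error", FutureWarning)
    df.resample("h").sum()   # raises immediately if a deprecated alias slips through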
This guide is based on the Pandas 1.x → 2.x migration of the Energyworx platform (March 2026). For the official Pandas migration documentation, see the Pandas 2.0 What's New and Pandas 2.2 What's New.