df.shift() method in Pandas
Introduction
The df.shift() method shifts the values of a Series or DataFrame by a specified number of periods while the index stays in place (unless freq is given, in which case the index itself is shifted). This is essential for calculating changes over time or comparing current values with previous ones.
Real-world use case: Comparing today's stock price with yesterday's price to calculate daily returns.
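As a minimal sketch of this use case, the snippet below computes simple daily returns from a shifted price series. The prices and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical closing prices for four consecutive trading days
prices = pd.Series([100.0, 105.0, 102.0, 110.0], name="Price")

# Previous day's price, aligned with the current row
prev_prices = prices.shift(1)

# Simple daily return: (today - yesterday) / yesterday
daily_returns = (prices - prev_prices) / prev_prices

print(daily_returns)
```

The first row has no previous price, so its return is NaN. For this particular calculation pandas also offers Series.pct_change(), which gives the same result without an explicit shift.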
Syntax
Python
df.shift(periods=1, freq=None, axis=0, fill_value=None)
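- periods: Number of periods to shift. Positive values shift down (lag), negative values shift up (lead).
- freq: Optional time offset (for example 'D' or a DateOffset). When given, the index is shifted by that offset and the data is left unchanged; this requires a datetime-like index.
- axis: Axis to shift along: 0 (rows, the default) or 1 (columns).
- fill_value: Value used for the newly introduced missing entries. If omitted, a missing-value marker such as NaN is used.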
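The fill_value and axis parameters from the signature can be illustrated with a small sketch. The DataFrame and the variable names here are hypothetical and chosen only to make the effect easy to see.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# fill_value replaces the missing values created in the vacated first row
filled = df.shift(periods=1, fill_value=0)

# axis=1 shifts across columns instead of down rows:
# column "b" receives the values of "a", and "a" becomes all missing
sideways = df.shift(periods=1, axis=1)

print(filled)
print(sideways)
```

With fill_value=0 the columns stay free of NaN, which is why the filled result contains no missing entries in its first row.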
Example: Lagging Data
Python
import pandas as pd
df = pd.DataFrame({'Price': [100, 105, 102, 110]})
# 1. Shift down (Yesterday's price)
df['Prev_Price'] = df['Price'].shift(1)
# 2. Shift up (Tomorrow's price)
df['Next_Price'] = df['Price'].shift(-1)
print(df)
The output of the above code is:
Output
   Price  Prev_Price  Next_Price
0    100         NaN       105.0
1    105       100.0       102.0
2    102       105.0       110.0
3    110       102.0         NaN
Why use Shift?
- Time-series comparison: Calculate row-to-row differences easily.
- Feature Engineering: Create "Lag Features" for machine learning models (using past values to predict future ones).
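Both uses can be sketched on the same price data as the earlier example. The column names Change and Price_lag_N are hypothetical, and dropping incomplete rows is just one possible way to handle the missing values.

```python
import pandas as pd

df = pd.DataFrame({"Price": [100, 105, 102, 110]})

# Row-to-row difference: current value minus the previous row's value
df["Change"] = df["Price"] - df["Price"].shift(1)

# Lag features: the values from 1 and 2 rows earlier
for lag in (1, 2):
    df[f"Price_lag_{lag}"] = df["Price"].shift(lag)

# Rows without a full history contain NaN; drop them before training a model
features = df.dropna()

print(features)
```

Only the last two rows have both lag values available, so features keeps the rows with index 2 and 3. The Change column could also be produced with df["Price"].diff().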
🚀 Best Practices
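- Shifting by rows leaves missing values (NaN, or NaT for datetime data) in the vacated rows, which can also turn integer columns into floats. Pass fill_value to substitute a value of your choice.
- If you have a datetime index, use the freq parameter to shift by time units (like days or months) instead of just rows. In that case the index moves and the data stays put, so no missing values are introduced.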
- Remember: shift(1) is a "Lag" (past), and shift(-1) is a "Lead" (future).
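The difference between a row shift and a time shift can be seen in a short sketch. It assumes a daily DatetimeIndex; the dates and values are made up for illustration.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=3, freq="D")
s = pd.Series([100, 105, 102], index=idx)

# Row shift: values move down one row, the index is unchanged, NaN appears
by_rows = s.shift(1)

# Time shift: the index moves forward by one day, the values are untouched
by_time = s.shift(1, freq="D")

print(by_rows)
print(by_time)
```

After the time shift the series starts on 2024-01-02 and still holds all three original values, whereas the row shift keeps the original dates and loses the last value.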