In Python pandas, this is not the default behavior when aggregating a DataFrame, but we can get it easily with a few lines of code.
I could not find this solution on Stack Overflow, so I am writing it down here. But first, let's see what happens to a column of string type when we do not use a function like first:
    import pandas as pd

    df = pd.DataFrame.from_records([
        dict(k=1, i=10, t="a"),
        dict(k=1, i=20, t="b"),
        dict(k=1, i=20, t="c"),
    ])
    df.groupby("k", as_index=False).sum()

If we run the code above, we get this result:
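
       k   i
    0  1  50

You can see that the column t was removed, since it cannot be summed. (Newer pandas versions, 2.0 and later, concatenate the strings instead of dropping the column.)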
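Depending on the pandas version, a plain sum() may either drop the string column or concatenate its values. A minimal sketch, assuming you only want the numeric columns summed, is to say so explicitly with the numeric_only argument (the variable name numeric_only_result is just for illustration):

```python
import pandas as pd

df = pd.DataFrame.from_records([
    dict(k=1, i=10, t="a"),
    dict(k=1, i=20, t="b"),
    dict(k=1, i=20, t="c"),
])

# Sum only the numeric columns; the string column t is left out of the result.
numeric_only_result = df.groupby("k", as_index=False).sum(numeric_only=True)
print(numeric_only_result)
```

This keeps the result limited to k and i, which is the situation the rest of this post starts from.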
Now, let's add the first aggregation function to this column:
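    # Return the first value of each group, or None if the group is empty
    first = lambda a: a.values[0] if len(a) > 0 else None
    # "sum" is passed as a string to avoid the warning newer pandas raises for the builtin
    df.groupby("k", as_index=False).agg({'i': "sum", 't': first})

If we run the code above, we get this result: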

       k   i  t
    0  1  50  a

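As a side note, pandas also accepts aggregation names as strings, and "first" is one of them. The sketch below is an alternative to the lambda, not a drop-in replacement: as far as I know, the built-in "first" returns the first non-null value of each group, while the lambda returns the first value even when it is null. The variable name result is only for illustration.

```python
import pandas as pd

df = pd.DataFrame.from_records([
    dict(k=1, i=10, t="a"),
    dict(k=1, i=20, t="b"),
    dict(k=1, i=20, t="c"),
])

# Use the built-in aggregation names instead of a custom function.
# Caveat: the built-in "first" picks the first non-null value per group.
result = df.groupby("k", as_index=False).agg({"i": "sum", "t": "first"})
print(result)
```

For the data in this post both approaches give the same row, since the column t has no missing values.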
Problem solved.