Four common functions in pandas

pd.concat parameters:

objs : the objects to be concatenated, e.g. [df1, df2]

axis : the axis to concatenate along: 0 (default) stacks rows, 1 places the objects side by side

join : how to handle the other axis: 'outer' (default) takes the union of the labels, 'inner' takes the intersection

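join_axes : the specific labels to keep on the other axis, e.g. the column names to be displayed (this parameter was removed in pandas 1.0; reindex the result instead)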

ignore_index : whether to ignore the indexes of the original DataFrame/Series object and rearrange it

keys : set multi-level index labels for the data source

levels : the specific values to use as levels of the hierarchical index, if keys is set

names : names for the levels of the hierarchical index, if keys or levels is set

verify_integrity : check whether the concatenated axis contains duplicate index values and raise an exception if it does (defaults to False)
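The concatenation parameters above can be sketched as follows. This is a minimal illustration rather than a complete tour; the frames df1 and df2 and their column names are made up for the example.

```python
import pandas as pd

# Hypothetical sample frames: they share column "b" and both use row index 0, 1.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# join="outer" (the default): union of the columns, missing cells become NaN.
# ignore_index=True discards the original row labels and renumbers from 0.
outer = pd.concat([df1, df2], join="outer", ignore_index=True)

# join="inner": only the columns present in every frame survive.
inner = pd.concat([df1, df2], join="inner", ignore_index=True)

# keys + names build a two-level row index that records the source frame.
labeled = pd.concat([df1, df2], keys=["first", "second"], names=["source", "row"])

# verify_integrity=True raises ValueError here because both frames use index 0 and 1.
try:
    pd.concat([df1, df2], verify_integrity=True)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(outer)
print(inner)
print(labeled)
print(duplicate_error)
```

Combining ignore_index=True with keys is not useful, since the renumbered index replaces the key labels.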

pd.merge parameters:

left : the left DataFrame involved in the merge

right : the right DataFrame involved in the merge

how : the join type: 'inner' (default, intersection of the keys), 'outer' (union of the keys), 'left', or 'right'

on : the name of the column used for the join, which must be present in both the left and right DataFrame objects; if not specified (and no other join key is given), the intersection of the left and right column names is used as the join key

left_on : the column used as the join key in the left DataFrame

right_on : the column used as the join key in the right DataFrame

left_index : Use the left row index as its join key

right_index : Use the right row index as its join key

sort : sort the merged data by the join key; defaults to False in current pandas, and leaving it disabled usually performs better on large datasets

suffixes : a tuple of strings to append to the names of overlapping columns; the default is ('_x', '_y'). For example, if both the left and right DataFrame objects have a 'data' column, the result will contain 'data_x' and 'data_y'

copy : set to False to avoid copying data into the result data structure in some special cases; by default the data is always copied
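A short sketch of how the merge parameters above interact. The frames and the column names key, rkey and data are hypothetical and chosen only to show overlapping columns.

```python
import pandas as pd

# Hypothetical sample frames: both have a "key" column and an overlapping "data" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "data": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "data": [20, 30, 40]})

# Inner join on the shared column: only keys present in both frames remain.
# The overlapping "data" column gets the default suffixes "_x" and "_y".
inner = pd.merge(left, right, on="key", how="inner")

# Outer join keeps every key from either side and fills the gaps with NaN.
outer = pd.merge(left, right, on="key", how="outer")

# Different key column names on each side: use left_on / right_on,
# here with custom suffixes for the overlapping column.
right_renamed = right.rename(columns={"key": "rkey"})
by_names = pd.merge(
    left, right_renamed,
    left_on="key", right_on="rkey",
    suffixes=("_left", "_right"),
)

# Join a column on the left against the row index on the right.
by_index = pd.merge(left, right.set_index("key"), left_on="key", right_index=True)

print(inner)
print(outer)
print(by_names)
print(by_index)
```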

groupby is easy to get started with, and earlier bloggers have already covered it in great detail:

Pandas Explained XV - Grouping with GroupBy Technology
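For readers who want a quick reminder before following that post, a basic groupby can look like the sketch below. The loans frame and its region and amount columns are invented for illustration and are not taken from the referenced article.

```python
import pandas as pd

# Hypothetical loan data, invented for this example.
loans = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "amount": [100, 300, 200, 400],
})

# Split the rows by region, then sum the amounts within each group.
totals = loans.groupby("region")["amount"].sum()

# Several aggregations at once, using named aggregation
# (result column name = (source column, aggregation function)).
summary = loans.groupby("region").agg(
    total=("amount", "sum"),
    average=("amount", "mean"),
)

print(totals)
print(summary)
```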

A few additional points:

data : the DataFrame used to create the pivot table

values : the column(s) to aggregate; just enter the name of the column

index : the key(s) used to group the rows

columns : the key(s) used to group the columns

aggfunc : the aggregation method, defaults to mean. A dictionary can be used to specify different aggregation functions for different columns, in which case values can be omitted

fill_value : the value used to replace missing values in the resulting table

dropna : whether to drop columns whose entries are all missing (defaults to True)

margins : whether or not to add summary rows/columns (margins) computed with aggfunc; defaults to False

margins_name : the name of the margin row/column
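The pivot table parameters above can be tried out with a small sketch like the following. The sales frame and its columns are hypothetical, and "Total" is just an arbitrary choice for margins_name.

```python
import pandas as pd

# Hypothetical sales data; there is no ("west", "B") combination on purpose.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "product": ["A", "B", "A", "A"],
    "amount": [10, 20, 30, 50],
    "units": [1, 2, 3, 5],
})

# Rows grouped by region, columns grouped by product, amounts summed.
# fill_value=0 replaces the missing ("west", "B") cell,
# margins=True appends a summary row and column named by margins_name.
table = pd.pivot_table(
    sales,
    values="amount",
    index="region",
    columns="product",
    aggfunc="sum",
    fill_value=0,
    margins=True,
    margins_name="Total",
)

# A dict for aggfunc picks a different aggregation per column;
# values is omitted because the dict keys name the columns to aggregate.
per_column = pd.pivot_table(
    sales,
    index="region",
    aggfunc={"amount": "sum", "units": "mean"},
)

print(table)
print(per_column)
```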