DataFrame.DataFrame

Sets up a DataFrame object from the specified table.

Sets up the loading of the table into the rest of the execution engine. The returned value can be transformed, aggregated, and filtered using the DataFrame methods. Note that no data is actually loaded until the entire query is compiled, so running this call by itself does nothing until a full pipeline is constructed. DataFrame can load any of the available tables. See px.GetSchemas() for a list of tables and the columns that can be loaded.
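The deferred loading described above can be illustrated with a toy model in plain Python. This is only a sketch and not Pixie's implementation: `LazyTable`, its `filter` and `run` methods, and the in-memory `TABLES` dict are hypothetical names invented for this illustration. Constructing the object only records what to load; rows are read when the whole pipeline is run.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory data standing in for a real table source.
TABLES: Dict[str, List[dict]] = {
    "http_events": [
        {"upid": "a", "req_body": ""},
        {"upid": "b", "req_body": "{\"k\": 1}"},
    ],
}


class LazyTable:
    """Toy stand-in for a lazily evaluated DataFrame (not Pixie's API)."""

    def __init__(self, table: str, select: Optional[List[str]] = None):
        # Only record *what* to load; no rows are read here.
        self.table = table
        self.select = select
        self.steps: List[Callable[[List[dict]], List[dict]]] = []
        self.loads = 0  # number of times rows were actually read

    def filter(self, pred: Callable[[dict], bool]) -> "LazyTable":
        # Queue the operation instead of applying it immediately.
        self.steps.append(lambda rows: [r for r in rows if pred(r)])
        return self

    def run(self) -> List[dict]:
        # The full pipeline is known at this point, so load and apply it.
        self.loads += 1
        rows = TABLES[self.table]
        if self.select:
            rows = [{c: r[c] for c in self.select} for r in rows]
        for step in self.steps:
            rows = step(rows)
        return rows


df = LazyTable("http_events").filter(lambda r: r["req_body"] != "")
loads_before_run = df.loads  # still 0: nothing has been read yet
result = df.run()            # rows are read and filtered here
```

In this model, building `df` leaves `loads` at zero, which mirrors the note that a DataFrame call alone does nothing until a full pipeline exists.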

Arguments

| Name | Type | Description |
| --- | --- | --- |
| table | string | The table name to load. |
| select | List[str] | The columns of the table to load. Leave empty to select all columns. |
| start_time | px.Time | The earliest timestamp of data to load. The format can be one of the following: (1) relative time with format "-5m" or "-3h", (2) absolute time with format "2020-07-13 18:02:5.00 +0000", (3) absolute time in nanoseconds, or (4) `None`. Defaults to `None`. If `start_time` is `None`, loading begins with the first record in the table. |
| end_time | px.Time | The last timestamp of data to load. The format can be one of the following: (1) relative time with format "-5m" or "-3h", (2) absolute time with format "2020-07-13 18:02:5.00 +0000", (3) absolute time in nanoseconds, or (4) `None`. Defaults to `None`. If `end_time` is `None` and `df.stream()` was not called on this DataFrame, the DataFrame processes data up to the last record that was in the table at the beginning of query execution. If `end_time` is `None` and `df.stream()` was called on this DataFrame, the DataFrame processes data indefinitely. |
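The four accepted time formats can be told apart by type and shape. The following plain-Python sketch shows one way to do that; it is not Pixie's parser. `classify_time` and `relative_offset_ns` are hypothetical helper names, and the relative pattern deliberately handles only the `m` and `h` units that appear in the description above.

```python
import re
from typing import Union

# Matches relative specs such as "-5m" or "-3h" (only the units shown above).
_RELATIVE = re.compile(r"^-(\d+)([mh])$")
_UNIT_NS = {"m": 60 * 10**9, "h": 3600 * 10**9}


def classify_time(value: Union[str, int, None]) -> str:
    """Label a time argument by which of the four documented forms it takes."""
    if value is None:
        return "none"
    if isinstance(value, int):
        return "absolute_ns"
    if _RELATIVE.match(value):
        return "relative"
    # Any other string is assumed to be an absolute timestamp string.
    return "absolute_string"


def relative_offset_ns(value: str) -> int:
    """Convert a relative spec such as "-5m" into a negative nanosecond offset."""
    match = _RELATIVE.match(value)
    if match is None:
        raise ValueError(f"not a supported relative time: {value!r}")
    amount, unit = match.groups()
    return -int(amount) * _UNIT_NS[unit]
```

For example, `classify_time("-5m")` returns `"relative"`, and `relative_offset_ns("-5m")` returns the offset of five minutes before now, expressed in nanoseconds.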

Returns

px.DataFrame: DataFrame loaded from the table with the specified columns and time period.

Examples

import px

# Select all columns.
df = px.DataFrame('http_events', start_time='-5m')
# Select subset of columns.
df = px.DataFrame('http_events', select=['upid', 'req_body'], start_time='-5m')
# Absolute time specification.
df = px.DataFrame('http_events', start_time='2020-07-13 18:02:5.00 -0700')
# Absolute time specification (nanoseconds). Note that this format only works in PxL scripts;
# the Live UI's `start_time` argument does not support it.
df = px.DataFrame('http_events', start_time=1646157769000000000)
