Use to extract data from an existing Analyze data table into another data table. Examples include, but are not limited to, the following:
Note: There is no actual function exclusive to this transform. All sorting, grouping, filtering, etc. can be performed in any other transform with the Table Data Selection and Data Filters tabs.
To establish the source and target, first select the data table to be exported from the Source Table dropdown menu. Next, select the target file path from PlaidCloud Document using the dropdown menu to select the appropriate account before navigating to the actual directory in the section immediately below. Finally, provide the target file with a descriptive name.
Providing a file extension is advised, but not required by Analyze. The data table will be exported into the appropriate file format with or without an extension.
The Table Data Selection tab is used to map columns from the source data table to the target data table. All source columns on the left side of the window are automatically mapped to the target data table depicted on the right side of the window. Using the Inspect Source menu button, there are a few additional ways to map columns from source to target:
In addition to each of these options, each choice offers the ability to preview the source data.
If the source and target column options aren’t enough, other columns can be added into the target data table in several different ways:
Selecting Propagate All may effectively create a duplicate of every column. Analyze does not check to see if the columns are already mapped. Make sure duplicate column names do not exist.
To delete columns from the target data table, select the desired column(s), then right click and select Delete.
To rearrange columns in the target data table, select the desired column(s), then right click and select Move to Top, Move Up, Move Down, or Move to Bottom.
To return only distinct options, select the Distinct menu option. This will toggle a set of checkboxes for each column in the source. Simply check any box next to the corresponding column to return distinct results only.
When the target data table contains only a subset of the source data table, only select the check box next to the columns which are to be included in the target data table. Selecting all checkboxes could provide output that does not appear to be distinct.
To aggregate results, select the Summarize menu option. This will toggle a set of drop down boxes for each column in the target data table. The following summarization options are available:
For more aggregation details, see the Analyze overview page [here](/docs/analyze/#aggregation).
To allow for maximum flexibility, data filters are available on the source data and the target data. For larger data sets, it can be especially beneficial to filter out rows on the source so the remaining operations are performed on a smaller data set.
Any valid Python expression is acceptable to subset the data. Please see Expressions for more details and examples.
Compound filters must have individual elements wrapped in parentheses. For example, if filtering for Temperature and Humidity, a valid filter would look like this:
To report duplicates, select the Report Duplicates in Table checkbox and then specify an output table which will contain all of the duplicate records.
This will not remove the duplicate items from the target data table. To remove duplicate items, use the Distinct menu options as specified in the [Table Data Selection](../transforms/common_features#table-data-selection) section.
Example code here
To limit the data, check the Apply Row Slicer box and then specify the following:
To limit the data, simply check the Apply Row Slicer box and then specify the following:
In this example, the Source Table, Import Google Spreadsheet, is filtered to include only results in which the temperature was listed at 75 degrees Fahrenheit or above. As such, the Target Table is named Filter Results Temp 75+.
All columns are mapped from source to target. No grouping, sorting, or summarization options are specified.
In the Data Filters tab, the source data is subset with the following expression: row[‘TemperatureF’] >= 75. This expression only keeps rows which have a value in the TemperatureF column equal to 75 or higher.
As expected on an Ohio summer day, the temperature first climbs above 75 degrees around noon and then remains there until nearly 10 PM.
In this example, the same Source Table, Import Google Spreadsheet, is used, but in this case, it will be used to identify distinct conditions reported throughout the day. Accordingly, the Target Table is named Distinct Conditions.
In this case, only a single column from the source data table is mapped to the target data table. Additionally, the Make Distinct button has been selected and applied only to the Conditions column. This should return only distinct values found in the source data table.
When the target data table contains only a subset of the source data table, select the check box next to only the columns which are to be included in the target data table. Selecting all checkboxes could provide output that does not appear to be distinct.
Since this example is looking for distinct values, it may be helpful to also identify non-distinct duplicate values. As such, any values which exist as duplicates will be added to the duplicate values from raw data input data table.
On this day, there were 4 unique conditions reported throughout the day.
For an example showing how to use a mathematical expression to populate a value in an additional column, please see the Calculate Harris Score section of the BCS Demo.
For an example showing how to sort the target data table by multiple columns, please see the Sort Rankings by Team section of the BCS Demo.
For an example showing an if/then/else conditional expression in lambda-like syntax (single line), please see the Convert Rankings to Points section of the BCS Demo.
For an example showing how to group and summarize results, please see the Calculate Total Computer Points section of the BCS Demo.