Script 1447: Auto Tag Groups with Fund type Dimension using Gsheet
Purpose:
The script automates the process of tagging groups with fund type dimensions by matching data from a primary source with a reference Google Sheet.
To Elaborate
The Python script is designed to automate the process of tagging groups with fund type dimensions by using data from a primary data source and a reference Google Sheet. The script reads fund information from the primary data source and attempts to match each fund with corresponding fund type information from the Google Sheet. If a match is found, the script assigns the appropriate fund type to the output data. If no match is found, it assigns a default value of “Unknown” to indicate that the fund type could not be determined. This process helps in organizing and categorizing financial data based on fund types, which is crucial for structured budget allocation (SBA) and financial analysis.
Walking Through the Code
- Initialization and Data Source Setup
- The script begins by setting up the current date and defining the primary data source (
inputDf
) and its relevant columns, such as ‘Group’, ‘Account’, ‘Campaign’, ‘Group ID’, ‘FUND’, and ‘FUND Type’. - It also sets up a reference data source from Google Sheets (
gSheetsDf
) and identifies the columns to be used for matching: the first column for fund codes, the second for funds, and the third for fund types.
- The script begins by setting up the current date and defining the primary data source (
- Output DataFrame Preparation
- The script initializes the output DataFrame (
outputDf
) by copying the ‘FUND’ column from the primary data source and setting the ‘FUND Type’ column to an empty string, preparing it for the assignment of matched fund types.
- The script initializes the output DataFrame (
- Matching Process
- The script iterates over each row in the primary data source, extracting the fund value and attempting to find a matching row in the Google Sheet based on the fund code.
- If a match is found, it retrieves the corresponding fund type from the Google Sheet and assigns it to the ‘FUND Type’ column in the output DataFrame.
- If no match is found, it logs the absence of a match and assigns “Unknown” to the ‘FUND Type’ column.
- Output and Validation
- Finally, the script prints the first few rows of the output DataFrame to verify the results.
- It includes a workaround to skip output validations due to limitations in script settings related to group dimension columns.
Vitals
- Script ID : 1447
- Client ID / Customer ID: 1306922277 / 60268979
- Action Type: Bulk Upload
- Item Changed: AdGroup
- Output Columns: Account, Campaign, Group, Group ID
- Linked Datasource: M1 Report
- Reference Datasource: Google Sheets
- Owner: dwaidhas@marinsoftware.com (dwaidhas@marinsoftware.com)
- Created by dwaidhas@marinsoftware.com on 2024-10-18 18:16
- Last Updated by Michael Huang on 2025-02-10 03:23
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
##
## name: Auto Tag Groups with Fund and Fund type Dimension using Gsheet
## description:
##
##
## author:
## created: 2024-10-18
##
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
# Primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_GROUP = 'Group'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP_ID = 'Group ID'
RPT_COL_FUND = 'FUND'
RPT_COL_FUND_TYPE = 'FUND Type'
# Reference data source (Google Sheets) and columns
gSheetsDf = dataSourceDict["2_1"] # Use the first sheet from GSheets
# Let's define the first column in gSheets as the column to match with RPT_COL_FUND
GSHEET_COL_FUND_CODE = gSheetsDf.columns[0] # First column of gSheets
# Ensure you have the correct columns for FUND and FUND TYPE in gSheets
GSHEET_COL_FUND = gSheetsDf.columns[1] # Second column is FUND
GSHEET_COL_FUND_TYPE = gSheetsDf.columns[2] # Third column is FUND TYPE
# Output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_FUND = 'FUND'
BULK_COL_FUND_TYPE = 'FUND Type'
outputDf[BULK_COL_FUND] = inputDf[RPT_COL_FUND]
outputDf[BULK_COL_FUND_TYPE] = "" # Initialize as empty
# Perform matching based on the first column (Fund Code) in gSheetsDf
for index, row in inputDf.iterrows():
fund = row[RPT_COL_FUND]
# Print the value of fund to check what you're trying to match
print(f"Trying to match fund: {fund}")
# Find matching row in the GSheet's first column (GSHEET_COL_FUND_CODE)
matching_row = gSheetsDf[gSheetsDf[GSHEET_COL_FUND_CODE] == fund]
# Log if we find any match
if not matching_row.empty:
fund_type = matching_row[GSHEET_COL_FUND_TYPE].values[0]
print(f"Match found: {fund} -> {fund_type}")
# If a match is found, assign the corresponding Fund Type (from Column 3) to the output
outputDf.at[index, BULK_COL_FUND_TYPE] = fund_type
else:
# Log if no match is found
print(f"No matching Fund Code found for: {fund}")
outputDf.at[index, BULK_COL_FUND_TYPE] = "Unknown" # Default or placeholder value
# Display the first few rows of outputDf
print(outputDf.head())
# workaround to Preview warning:
# skip validation because Group Dimension columns can't be selected in Scripts Settings
skip_output_validations = True
Post generated on 2025-03-11 01:25:51 GMT