Script 1445: Tag Groups with FUND Dimension

Purpose:

The Python script tags groups with a FUND dimension based on abbreviations found in group names.

To Elaborate

The Python script is designed to enhance a dataset by tagging each group with a corresponding FUND dimension. This is achieved by checking if any predefined abbreviations, sourced from a Google Sheets document, are present in the group names. If an abbreviation is found within a group name, the group is tagged with that abbreviation as its FUND dimension. If no abbreviation is found, the group is tagged with “Unknown”. This process helps in categorizing and organizing data based on specific FUND codes, which can be crucial for structured budget allocation (SBA) and financial reporting.

Walking Through the Code

  1. Data Initialization:
    • The script begins by setting up the primary data source, which contains columns such as ‘Group’, ‘Account’, ‘Campaign’, and ‘FUND’.
    • An output DataFrame is initialized with a column for ‘FUND’, which is initially set to empty strings.
  2. Reference Data Setup:
    • A reference DataFrame is loaded from a Google Sheets document, which contains FUND abbreviations in its first column.
  3. Function Definition:
    • A function tag_fund_from_group is defined to check if any abbreviation from the Google Sheets data is present in a given group name. If found, it returns the abbreviation; otherwise, it returns “Unknown”.
  4. Application of Function:
    • The function is applied to each group name in the input DataFrame. The results are stored in the ‘FUND’ column of the output DataFrame.
  5. Output Preview:
    • Finally, the script prints the first few rows of the updated DataFrame to preview the results.

Vitals

  • Script ID : 1445
  • Client ID / Customer ID: 1306922277 / 60268979
  • Action Type: Bulk Upload
  • Item Changed: AdGroup
  • Output Columns: Account, Campaign, Group, FUND
  • Linked Datasource: M1 Report
  • Reference Datasource: Google Sheets
  • Owner: dwaidhas@marinsoftware.com (dwaidhas@marinsoftware.com)
  • Created by dwaidhas@marinsoftware.com on 2024-10-18 16:56
  • Last Updated by ascott@marinsoftware.com on 2025-02-20 19:19
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
##
## name: Tag Groups with FUND Dimension
## description:
##  
## 
## author: Dana Waidhas 
## created: 2024-10-18
## 

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_GROUP = 'Group'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP_ID = 'Group ID'
RPT_COL_FUND = 'FUND'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_FUND = 'FUND'
outputDf[BULK_COL_FUND] = ""

# Reference data source (Google Sheets) and columns
gSheetsDf = dataSourceDict["2_1"]  # First sheet from Google Sheets
GSHEET_COL_FUND_CODE = gSheetsDf.columns[0]  # Column A in Google Sheets (Abbreviations)

# Initialize FUND column in output DataFrame
outputDf[BULK_COL_FUND] = ""

# Function to check for abbreviation in Group name and tag FUND
def tag_fund_from_group(group_name):
    for abbreviation in gSheetsDf[GSHEET_COL_FUND_CODE]:
        if abbreviation in group_name:
            return abbreviation  # Tag FUND with matching abbreviation
    return "Unknown"  # Default if no abbreviation is found

# Apply the function to each Group in inputDf
outputDf[BULK_COL_FUND] = inputDf[RPT_COL_GROUP].apply(tag_fund_from_group)

# Preview the updated DataFrame
print(outputDf.head())

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus