Script 1029: Tagging social dimension Geo

Purpose

Tag the social dimension Geo on each campaign, based on the campaign name.

To Elaborate

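This Python script tags the social dimension Geo from the campaign name in the input dataframe. It lowercases each campaign name, checks which of the configured Geo keywords occur in it as substrings, and writes the matches to the Geo column of the output dataframe as a comma-separated string. Rows with a missing Geo value are dropped. Because matching is by substring, a short keyword such as 'ca' also matches inside a longer word such as 'campaign'.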
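
To make the matching rule concrete, here is a minimal sketch of the keyword lookup applied to a few campaign names. The keyword list is copied from the script; the campaign names are hypothetical and chosen only for illustration.

```python
# Keyword list copied from the script's configuration.
GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']

def extract_geo(campaign_name):
    """Return every keyword found in the lowercased name, comma-separated."""
    campaign_lower = campaign_name.lower()
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in campaign_lower)

# Hypothetical campaign names, not taken from any real account.
samples = ["Brand_UK_Search", "Retargeting_APAC", "Generic_EMEA"]
results = {name: extract_geo(name) for name in samples}

for name, geo in results.items():
    print("Campaign [%s] => Geo [%s]" % (name, geo))
```

Note that a name with no matching keyword yields an empty string rather than a missing value.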

Walking Through the Code

  1. The configurable parameters for the script are defined, including the list of Geo keywords and the column names for the campaign, account, and Geo in both the input and output dataframes.
  2. The extract_geo function is defined to extract the Geo from a given campaign name. It converts the campaign name to lowercase and checks each Geo keyword for a substring match. Every keyword found is added to a list, in the order the keywords are configured. The function returns the matches joined by commas, or an empty string if no keyword matches.
  3. The input dataframe is copied to the output dataframe.
  4. A loop iterates through each row in the input dataframe.
  5. The campaign name for the current row is extracted.
  6. The extract_geo function is called to extract the Geo from the campaign name.
  7. The campaign name and extracted Geo are printed.
  8. The Geo column in the output dataframe is updated with the extracted Geo value for the current row.
  9. Rows with missing Geo values are dropped from the output dataframe.
  10. If the output dataframe is not empty, it is printed in a tabular format using the tableize function. Otherwise, “Empty outputDf” is printed.
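
The steps above can be sketched end to end without the platform. This is an illustration under stated assumptions, not the script itself: a list of dicts stands in for inputDf, the account and campaign values are hypothetical, and an empty Geo string is treated as missing when rows are dropped.

```python
import copy

GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']

def extract_geo(campaign_name):
    campaign_lower = campaign_name.lower()
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in campaign_lower)

# Hypothetical stand-in for inputDf: one dict per row.
input_rows = [
    {'Account': 'Acct A', 'Campaign': 'Brand_UK_Search'},
    {'Account': 'Acct A', 'Campaign': 'Generic_EMEA'},
    {'Account': 'Acct B', 'Campaign': 'Retargeting_APAC'},
]

# Step 3: copy the input so the original rows stay untouched.
output_rows = copy.deepcopy(input_rows)

# Steps 4-8: tag each row with the Geo extracted from its campaign name.
for row in output_rows:
    geo = extract_geo(row['Campaign'])
    print("Campaign [%s] => Geo [%s]" % (row['Campaign'], geo))
    row['Geo'] = geo

# Step 9: drop rows without a Geo (assumption: '' counts as missing).
output_rows = [row for row in output_rows if row['Geo']]

# Step 10: report the result.
if output_rows:
    for row in output_rows:
        print(row)
else:
    print("Empty outputDf")
```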

Vitals

  • Script ID : 1029
  • Client ID / Customer ID: 1306927457 / 60270313
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Geo
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Autumn Archibald (aarchibald@marinsoftware.com)
  • Created by Autumn Archibald on 2024-04-29 23:18
  • Last Updated by Autumn Archibald on 2024-04-30 04:55

Python Code

# Define the configurable parameters for the script
GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GEO = 'Geo'

# Function to extract geo from campaign name
def extract_geo(campaign_name):
    campaign_lower = campaign_name.lower()
    geo_list = []
    for geo in GEO_KEYWORDS:
        if geo in campaign_lower:
            geo_list.append(geo)
    return ', '.join(geo_list)

# Copy input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]

    # Extract geo from campaign name
    geo = extract_geo(campaign_name)
    print("Campaign [%s] => Geo [%s]" % (campaign_name, geo))

    # Update geo column
    outputDf.at[index, BULK_COL_GEO] = geo

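# Drop any rows with missing geo values.
# extract_geo returns '' when nothing matches, so filter empty strings as well as NaN.
outputDf = outputDf.dropna(subset=[BULK_COL_GEO])
outputDf = outputDf[outputDf[BULK_COL_GEO] != '']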

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
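
One caveat of the substring test in extract_geo is that short keywords match inside ordinary words. The sketch below shows a possible alternative that compares keywords against whole tokens instead. The function name extract_geo_tokens and the assumption that tokens are separated by non-alphanumeric characters are illustrative and not part of the script.

```python
import re

GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']

def extract_geo_tokens(campaign_name):
    """Match keywords against whole tokens rather than substrings."""
    # Assumption: tokens are separated by any non-alphanumeric character.
    tokens = set(re.split(r'[^a-z0-9]+', campaign_name.lower()))
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in tokens)

# Hypothetical campaign name that triggers false positives.
name = "Business_Campaign_UK"

# Substring matching, as in the script: 'us' and 'ca' match inside words.
substring_result = ', '.join(g for g in GEO_KEYWORDS if g in name.lower())

# Token matching: only the standalone 'uk' token matches.
token_result = extract_geo_tokens(name)

print("substring:", substring_result)
print("token:    ", token_result)
```

Whether token matching is appropriate depends on the naming convention; names that embed the Geo without a separator would no longer match.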

Post generated on 2024-05-15 07:44:05 GMT
