Script 1029: Tagging social dimension Geo
Purpose
The Python script identifies and tags geographical regions in campaign names based on predefined keywords.
To Elaborate
The script processes a dataset of marketing campaigns and looks for geographical regions mentioned in each campaign name. Each name is lowercased and checked against a predefined keyword list (us, ca, apac, nordics, uk, au); every keyword that appears anywhere in the name is collected, and the matches are written as a comma-separated string to a new Geo column. Tagging campaigns this way makes it possible to group and analyze them by geographical focus, which is useful for targeted marketing strategies and budget allocation. The script is also intended to keep only campaigns with at least one geographical tag in the final output, filtering out the rest.
Walking Through the Code
- Define Configurable Parameters
  - The script begins by defining a list of geographical keywords (GEO_KEYWORDS) that it will search for within campaign names. These keywords are user-configurable and can be adjusted to include any relevant geographical identifiers.
  - It also sets up column names for both the input and output dataframes, which are used to reference specific columns in the dataset.
- Extract Geo Function
  - A function named extract_geo is defined to process each campaign name. It converts the campaign name to lowercase and checks for the presence of any keywords from the GEO_KEYWORDS list.
  - If a keyword is found, it is added to a list, which is then joined into a string representing all identified geographical regions for that campaign.
- Process Input Data
  - The script creates a copy of the input dataframe to preserve the original data.
  - It iterates over each row in the input dataframe, extracting the campaign name and using the extract_geo function to determine the geographical tags.
- Update and Filter Output Data
  - The identified geographical tags are added to a new column in the output dataframe.
  - Rows without any geographical tags are removed from the output dataframe to ensure only relevant data is retained.
  - Finally, the script checks if the output dataframe is empty and prints the results accordingly.
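Because extract_geo checks whether each keyword occurs anywhere in the lowercased name, short keywords such as us or ca can also match inside ordinary words. The sketch below reproduces that substring check and compares it with a whole-token variant. The function extract_geo_tokens and the sample campaign names are hypothetical, introduced only for illustration; they are not part of the script.

```python
import re

# Same keyword list as the script's GEO_KEYWORDS.
GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']


def extract_geo(campaign_name):
    """Substring matching, as described in the walkthrough above."""
    campaign_lower = campaign_name.lower()
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in campaign_lower)


def extract_geo_tokens(campaign_name):
    """Hypothetical variant: match keywords only as whole tokens."""
    # Split on any run of non-alphanumeric characters (spaces, '_', '-', '|', ...).
    tokens = set(re.split(r'[^a-z0-9]+', campaign_name.lower()))
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in tokens)


# Hypothetical campaign names, for illustration only.
for name in ['Brand_US_Search', 'Business Cards - Generic', 'UK | AU Retargeting']:
    print(name, '=>', repr(extract_geo(name)), 'vs', repr(extract_geo_tokens(name)))
```

With these inputs, the substring version tags "Business Cards - Generic" as 'us, ca' (from "business" and "cards"), while the token version returns an empty string. Whether token matching is the better choice depends on how campaign names are actually delimited in the account.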
Vitals
- Script ID: 1029
- Client ID / Customer ID: 1306927457 / 60270313
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Geo
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Autumn Archibald (aarchibald@marinsoftware.com)
- Created by Autumn Archibald on 2024-04-29 23:18
- Last Updated by Autumn Archibald on 2024-04-30 04:55
Python Code
# Define the configurable parameters for the script
GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GEO = 'Geo'

# Function to extract geo from campaign name
def extract_geo(campaign_name):
    # Case-insensitive substring match against every keyword
    campaign_lower = campaign_name.lower()
    geo_list = []
    for geo in GEO_KEYWORDS:
        if geo in campaign_lower:
            geo_list.append(geo)
    # Returns '' (not a missing value) when no keyword matches
    return ', '.join(geo_list)

# Copy input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]
    # Extract geo from campaign name
    geo = extract_geo(campaign_name)
    print("Campaign [%s] => Geo [%s]" % (campaign_name, geo))
    # Update geo column
    outputDf.at[index, BULK_COL_GEO] = geo

# Drop any rows with missing geo values
outputDf = outputDf.dropna(subset=[BULK_COL_GEO])
# Also drop rows where no keyword matched: extract_geo returns '' in that
# case, which dropna does not treat as missing
outputDf = outputDf[outputDf[BULK_COL_GEO] != '']

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
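
One detail of the filtering step is worth checking: extract_geo returns an empty string, not a missing value, when no keyword matches a campaign name. The sketch below uses made-up data and assumes pandas is importable as pd. It shows how dropna treats an empty string differently from a missing value, and how an additional comparison keeps only tagged rows.

```python
import pandas as pd

# Hypothetical Geo values: one real tag, one empty string (no keyword matched),
# and one genuinely missing value.
df = pd.DataFrame({
    'Campaign': ['Brand_US_Search', 'Generic Display', 'Unlabelled'],
    'Geo': ['us', '', None],
})

# dropna removes only the missing value; the empty string survives.
after_dropna = df.dropna(subset=['Geo'])
print(after_dropna['Campaign'].tolist())

# Filtering on a non-empty string as well leaves only tagged campaigns.
tagged = after_dropna[after_dropna['Geo'] != '']
print(tagged['Campaign'].tolist())
```

The first print lists both 'Brand_US_Search' and 'Generic Display'; only the second is limited to the campaign that carries a geo tag.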
Post generated on 2024-11-27 06:58:46 GMT