Script 1029: Tagging social dimension Geo
Purpose:
The Python script identifies and tags geographical regions in campaign names based on predefined keywords.
To Elaborate
The script is designed to analyze campaign names and identify geographical regions mentioned within them. It uses a predefined list of geographical keywords to search through each campaign name, extracting any matching keywords that indicate a specific region. The extracted geographical information is then tagged to each campaign, allowing for better organization and analysis of campaigns based on their geographical focus. The script also ensures that only campaigns with identified geographical tags are retained, removing any entries that do not have associated geographical information.
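The tagging idea can be illustrated in a few lines of plain Python. This is a minimal sketch rather than the script itself: the helper name tag_geo and the sample campaign names are hypothetical, while the keyword list mirrors the one the script defines. Note that the check is a plain substring test, so a short keyword such as 'us' also matches inside longer words.

```python
# Keyword list as defined in the script; everything else here is illustrative.
GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']

def tag_geo(campaign_name):
    """Return the matching geo keywords as a comma-separated string."""
    name = campaign_name.lower()
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in name)

# Hypothetical campaign names
samples = [
    "Brand - UK - Search",
    "Display | APAC | Q2",
    "Generic - Business Tools",  # no region intended, but contains 'us'
    "Brand - Global",            # no keyword at all, so the tag is empty
]
for s in samples:
    print("Campaign [%s] => Geo [%s]" % (s, tag_geo(s)))
```

The third sample shows why keyword choice matters: very short keywords can produce tags that were never intended.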
Walking Through the Code
- Define Configurable Parameters:
  - The script begins by defining a list of geographical keywords (GEO_KEYWORDS) that are used to identify regions within campaign names. These keywords are user-changeable and can be updated to include additional regions as needed.
- Extract Geo Function:
  - A function named extract_geo is defined to process each campaign name. It converts the campaign name to lowercase and checks for the presence of any keywords from the GEO_KEYWORDS list. If a keyword is found, it is added to a list, which is then returned as a comma-separated string.
- Copy and Process Data:
  - The script copies the input data to a new DataFrame (outputDf) to preserve the original data. It iterates over each row, extracting the campaign name and using the extract_geo function to identify geographical tags.
- Update Geo Column:
  - For each campaign, the identified geographical tags are written to the Geo column of outputDf. The script prints the campaign name alongside its identified geographical tags for verification.
- Filter Data:
  - The script removes any rows from outputDf that do not have geographical tags, ensuring that only relevant campaigns are retained for further analysis.
- Output Verification:
  - Finally, the script checks whether outputDf is empty and prints the tableized output if it contains data, or a message indicating that it is empty.
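The copy, tag, and filter steps above can be sketched end to end on a small frame. This is a sketch under stated assumptions: the sample rows and account names are invented, inputDf is built by hand here (in the script it comes from the linked report), and a column-wise apply stands in for the row loop. The empty-string check matters because a campaign with no matching keyword yields '' rather than a missing value, which dropna alone would not remove.

```python
import pandas as pd

GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']

def extract_geo(campaign_name):
    # Lowercase the name and collect every keyword found as a substring
    name = str(campaign_name).lower()
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in name)

# Hypothetical stand-in for the report data the script receives
inputDf = pd.DataFrame({
    'Account': ['Acct A', 'Acct A', 'Acct B'],
    'Campaign': ['Brand - UK', 'Brand - Global', 'Promo - Nordics - AU'],
})

# Copy so the input frame is left untouched
outputDf = inputDf.copy()

# Tag every campaign, then keep only rows with a non-empty tag
outputDf['Geo'] = outputDf['Campaign'].apply(extract_geo)
outputDf = outputDf[outputDf['Geo'] != '']

print(outputDf)
```

Only the first and third rows survive the filter, and inputDf itself gains no Geo column because the work is done on the copy.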
Vitals
- Script ID: 1029
- Client ID / Customer ID: 1306927457 / 60270313
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Geo
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Autumn Archibald (aarchibald@marinsoftware.com)
- Created by Autumn Archibald on 2024-04-29 23:18
- Last Updated by Autumn Archibald on 2024-04-30 04:55
Python Code
# Define the configurable parameters for the script
GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GEO = 'Geo'

# Function to extract geo from campaign name
def extract_geo(campaign_name):
    # Case-insensitive substring match against each keyword
    campaign_lower = campaign_name.lower()
    geo_list = []
    for geo in GEO_KEYWORDS:
        if geo in campaign_lower:
            geo_list.append(geo)
    # Returns '' when no keyword matches
    return ', '.join(geo_list)

# Copy input rows to output
# (inputDf and tableize are not defined here; they are assumed to be
# supplied by the script runtime)
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]
    # Extract geo from campaign name
    geo = extract_geo(campaign_name)
    print("Campaign [%s] => Geo [%s]" % (campaign_name, geo))
    # Update geo column
    outputDf.at[index, BULK_COL_GEO] = geo

# Drop any rows without a geo tag. extract_geo returns '' when nothing
# matches, so filter out empty strings as well as missing values
# (dropna alone would keep them).
has_geo = outputDf[BULK_COL_GEO].fillna('').astype(str).str.strip() != ''
outputDf = outputDf[has_geo]

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
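Because the keyword test in extract_geo is a substring match, short keywords can fire inside unrelated words ('us' in 'Business', 'ca' in 'Campaign'). One possible refinement, offered as an assumption rather than as part of the original script, is to match keywords only as whole tokens. The helper name extract_geo_tokens is hypothetical.

```python
import re

GEO_KEYWORDS = ['us', 'ca', 'apac', 'nordics', 'uk', 'au']

def extract_geo_tokens(campaign_name):
    # Split on anything that is not a lowercase letter or digit,
    # then keep only keywords that appear as a whole token
    tokens = set(re.split(r'[^a-z0-9]+', campaign_name.lower()))
    return ', '.join(geo for geo in GEO_KEYWORDS if geo in tokens)

# Hypothetical campaign names
print(extract_geo_tokens("Business Campaign - US"))
print(extract_geo_tokens("Business Campaign"))
print(extract_geo_tokens("brand_UK_search"))
```

Whether this is preferable depends on the naming convention in the account: token matching assumes regions are separated from the rest of the name by spaces, dashes, underscores, or similar delimiters.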
Post generated on 2025-03-11 01:25:51 GMT