Script 1599: Script Auto Tag Country & Friendly Name
Purpose:
The script extracts country codes and friendly names from campaign names in a dataset.
To Elaborate
The Python script is designed to process a dataset containing campaign information and extract specific details from the campaign names. It focuses on identifying and separating the country code and a friendly name from each campaign name. The script assumes that the campaign names follow a structured format, where these details can be extracted based on their position within the name. The extracted information is then used to populate new columns in the output dataset, which can be used for further analysis or reporting. The script ensures that only rows with successfully extracted data are retained, thereby maintaining the integrity and relevance of the output.
Walking Through the Code
- Data Preparation:
- The script begins by importing necessary libraries and defining the primary data source, which is a DataFrame containing campaign data.
- It specifies the columns of interest from the input data, such as ‘Campaign’, ‘Campaign ID’, and ‘Account’.
- Function Definitions:
- Two functions,
extract_country
andextract_friendlyname
, are defined to parse the campaign name. They split the name by underscores and extract the country code and friendly name based on their positions.
- Two functions,
- Data Processing:
- An output DataFrame is created, copying relevant columns from the input data.
- The extraction functions are applied to the ‘Campaign’ column to populate new columns for country codes and friendly names.
- Data Cleaning:
- Rows where extraction fails (resulting in
None
values) are removed to ensure the output only contains valid data.
- Rows where extraction fails (resulting in
- Output:
- The script prints the first few rows of both the input and output DataFrames for verification purposes.
Vitals
- Script ID : 1599
- Client ID / Customer ID: 1306927959 / 60270403
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Country_Dim, FriendlyName_Dim, Campaign ID
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-12-23 21:27
- Last Updated by Grégory Pantaine on 2025-01-23 14:25
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
##
## name: Script - Auto Tag - Country & Friendly Name
## description: Extracts country code and friendly name from campaign names
##
## author:
## created: 2024-12-23
##
import datetime
import pandas as pd
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_CAMPAIGN_ID = 'Campaign ID'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
RPT_COL_COUNTRY_DIM = 'Country_Dim'
# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_COUNTRY_DIM = 'Country_Dim'
BULK_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
BULK_COL_CAMPAIGN_ID = 'Campaign ID'
# Function to extract country code
def extract_country(campaign_name):
try:
parts = campaign_name.split('_')
if len(parts) >= 4:
return parts[3][:2] # First two characters after third underscore
return None
except:
return None
# Function to extract friendly name
def extract_friendlyname(campaign_name):
try:
parts = campaign_name.split('_')
if len(parts) >= 4:
return parts[3][2:] # Everything after first two characters after third underscore
return None
except:
return None
# Create output dataframe with required columns
outputDf = inputDf[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_CAMPAIGN_ID]].copy()
outputDf.columns = [BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN, BULK_COL_CAMPAIGN_ID]
# Apply the extraction functions separately
outputDf[BULK_COL_COUNTRY_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_country)
outputDf[BULK_COL_FRIENDLYNAME_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_friendlyname)
# Remove rows where extraction failed (where either value is None)
outputDf = outputDf.dropna(subset=[BULK_COL_COUNTRY_DIM, BULK_COL_FRIENDLYNAME_DIM])
# Debug print first few rows
print(tableize(inputDf.head()))
print("\nOutput DataFrame:")
print(tableize(outputDf.head()))
Post generated on 2025-03-11 01:25:51 GMT