Script 1523: Script AutoTag Campaign ID Number
Purpose:
The script extracts and tags the numeric value before the first dash in a campaign name, removing any parentheses.
To Elaborate
The Python script is designed to process a dataset containing campaign information, specifically focusing on extracting a numeric identifier from the campaign name. The script identifies the numeric value enclosed in parentheses before the first dash in the campaign name and assigns it as the “Campaign ID #”. This operation is crucial for organizing and tagging campaigns with a unique identifier, which can be used for further analysis or reporting. The script ensures that only valid numeric identifiers are tagged and removes any unnecessary whitespace from the campaign names to maintain data consistency.
Walking Through the Code
- Configurable Parameters:
- The script begins by defining a configurable parameter
PLACEMENT_KEY
, which is set to the dash character (‘-‘). This key is used to identify the position in the campaign name where the numeric value should be extracted.
- The script begins by defining a configurable parameter
- Data Preparation:
- The primary data source is loaded into
inputDf
, which contains columns for campaign and account information. The script prepares an output DataFrameoutputDf
by copying the input data.
- The primary data source is loaded into
- Function Definition:
- A function
get_value_before_dash
is defined to extract the numeric value from the campaign name. It uses a regular expression to match and extract numbers enclosed in parentheses.
- A function
- Data Processing:
- The script iterates over each row in the input DataFrame. For each campaign name, it checks if the
PLACEMENT_KEY
is present. If so, it calls theget_value_before_dash
function to extract the numeric value.
- The script iterates over each row in the input DataFrame. For each campaign name, it checks if the
- Tagging and Cleaning:
- If a valid numeric value is found, it is assigned to the “Campaign ID #” column in the output DataFrame. The script also removes any rows where the “Campaign ID #” is not available and strips extra whitespace from the campaign names.
- Output:
- Finally, the script checks if the output DataFrame is empty and prints the first few rows if it contains data, ensuring that the processed information is ready for further use.
Vitals
- Script ID : 1523
- Client ID / Customer ID: 1306928453 / 60270539
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Campaign ID #
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-11-14 17:23
- Last Updated by emerryfield@marinsoftware.com on 2025-01-21 22:58
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
##
## name: Script AutoTag - Campaign ID Number
## description: Tags the value before the first - in the campaign name as the number without the brackets,
## ie: 5802
##
## author: G Pantaine with help from ChatGPT & M Huang.
## created: 2024-11-14
##
# Configurable Params - START
PLACEMENT_KEY = '-'
# Primary data source and columns
inputDf = dataSourceDict["1"]
# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_NUMERO_PROJET = 'Campaign ID #'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_NUMERO_PROJET = 'Campaign ID #'
# Function to extract the value before the first '-' and remove parentheses
def get_value_before_dash(campaign_name):
match = re.match(r"\((\d+)\)", campaign_name)
if match:
return match.group(1)
else:
print("Value not found: " + campaign_name)
return np.nan
# Copy all input rows to output
outputDf = inputDf.copy()
# Loop through all rows
for index, row in inputDf.iterrows():
campaign_name = row[RPT_COL_CAMPAIGN]
# Skip processing if campaign name does not contain the placement key
if PLACEMENT_KEY not in campaign_name:
continue
value = get_value_before_dash(campaign_name)
# Only tag if it's different than the existing tag
if pd.notna(value):
outputDf.at[index, BULK_COL_NUMERO_PROJET] = value
else:
outputDf.at[index, BULK_COL_NUMERO_PROJET] = np.nan
# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_NUMERO_PROJET])
# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()
if not outputDf.empty:
print("outputDf", outputDf.head().to_string())
else:
print("Empty outputDf")
Post generated on 2025-03-11 01:25:51 GMT