Script 1067: Script Campaign Numero de Projet
Purpose
The script extracts a project number from a campaign name and assigns it to a specific dimension called ‘Numero de Projet’.
To Elaborate
The Python script parses campaign names and extracts a project number, which it writes to the ‘Numero de Projet’ dimension. The project number is the first run of word characters (letters, digits, underscores) that immediately follows a hyphen (‘-’) in the campaign name. For each input row, the script checks whether the campaign name contains a hyphen and, if it does, extracts the tag. If the tag differs from the value already in the ‘Numero de Projet’ column, the column is updated with the new tag; if the tag is unchanged, the value is set to NaN so the row is dropped and only real changes are uploaded. Rows with an empty ‘Numero de Projet’ are excluded from the final output, and leading and trailing whitespace is stripped from campaign names to keep the data consistent.
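To make the extraction rule concrete, here is a minimal sketch of the same regular expression applied to a few campaign names. The campaign names and the helper name `extract_tag` are made up for illustration; only the pattern `-(\w+)` and the fallback to the full name come from the script.

```python
import re

def extract_tag(campaign_name: str) -> str:
    """Hypothetical helper mirroring the script's rule: first '-<word chars>' match."""
    match = re.search(r"-(\w+)", campaign_name)
    if match:
        return match.group(1)
    # No '-<word chars>' match: fall back to the whole (stripped) name,
    # as the script does
    return campaign_name.strip()

# Tag directly after the hyphen is captured up to the next non-word character
print(extract_tag("Brand FR -P1234 Search"))   # P1234

# With several hyphens, only the first match is used
print(extract_tag("Promo-2024-P77"))           # 2024

# A space after the hyphen prevents a match, so the full name comes back
print(extract_tag("Brand FR - P1234 "))        # Brand FR - P1234
```

The last case is worth noting: a campaign name that contains a hyphen but no tag directly after it would, under this rule, end up with the whole campaign name as its tag.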
Walking Through the Code
- Configurable Parameters:
  - The script begins by defining a configurable parameter, PLACEMENT_KEY, which is set to a hyphen (‘-’). This key is used to decide whether a campaign name is processed at all. The regular expression that extracts the tag hard-codes the hyphen separately, so changing PLACEMENT_KEY alone does not change the extraction pattern.
  - The primary data source is defined as inputDf, which is read from the dictionary dataSourceDict.
- Function Definition:
  - A function, get_tag_from_campaign_name, is defined to extract the tag from the campaign name using a regular expression. It searches for the first hyphen followed by one or more word characters and returns those characters as the tag. If nothing matches, it prints a message and returns the whole campaign name with surrounding whitespace removed.
- Data Processing:
  - The script copies the input data frame inputDf to outputDf for processing.
  - It iterates over each row of the input data frame, checking whether the campaign name contains the PLACEMENT_KEY.
  - If the key is present, the script extracts the tag using the defined function and compares it with the existing tag in the ‘Numero de Projet’ column.
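- Updating and Cleaning Data:
  - If the extracted tag is non-empty and different from the existing tag, the script updates the ‘Numero de Projet’ column with the new tag. Otherwise, it assigns a NaN value.
  - The script then drops rows whose ‘Numero de Projet’ is NaN and strips leading and trailing whitespace from the campaign names. Rows that were skipped because the campaign name has no hyphen keep their existing value, so they remain in the output whenever that value is not empty.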
- Output:
  - Finally, the script checks whether the output data frame outputDf is empty. If it contains data, the first few rows are printed; otherwise the message "Empty outputDf" is printed.
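The per-row decisions described above can be sketched without pandas. This is a simplified illustration, not the script itself: rows are plain dictionaries, `None` stands in for NaN, and the names `get_tag`, `process_rows` and the sample rows are assumptions made for the example.

```python
import re

PLACEMENT_KEY = "-"
COL_CAMPAIGN = "Campaign"
COL_PROJET = "Numero de Projet"

def get_tag(campaign_name):
    """First '-<word chars>' match, else the stripped campaign name."""
    match = re.search(r"-(\w+)", campaign_name)
    return match.group(1) if match else campaign_name.strip()

def process_rows(rows):
    """Return only the rows that would survive to the bulk output."""
    output = []
    for row in rows:
        new_row = dict(row)  # work on a copy, like outputDf = inputDf.copy()
        name = row[COL_CAMPAIGN]
        if PLACEMENT_KEY in name:
            tag = get_tag(name)
            changed = bool(tag) and tag != row[COL_PROJET]
            # Keep the tag only when it is new; otherwise blank the cell
            new_row[COL_PROJET] = tag if changed else None
        # Rows without the key fall through with their existing value
        if new_row[COL_PROJET] is not None:  # stands in for dropna
            new_row[COL_CAMPAIGN] = name.strip()
            output.append(new_row)
    return output

rows = [
    {COL_CAMPAIGN: " Brand-P100 ", COL_PROJET: None},   # new tag: kept
    {COL_CAMPAIGN: "Brand-P200", COL_PROJET: "P200"},   # unchanged: dropped
    {COL_CAMPAIGN: "Generic", COL_PROJET: "P300"},      # no hyphen: kept as-is
    {COL_CAMPAIGN: "Generic2", COL_PROJET: None},       # no hyphen, empty: dropped
]
result = process_rows(rows)
for r in result:
    print(r)
```

Only the first and third sample rows reach the output: the first because its tag is new, the third because it was skipped and already carried a value.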
Vitals
- Script ID: 1067
- Client ID / Customer ID: 1306927809 / 60270355
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Numero de Projet
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-05-10 16:33
- Last Updated by Grégory Pantaine on 2024-05-15 12:25
Python Code
## name: Script - Campaign - Numero de Projet
## description:
## Parse Campaign Name and pick out project number into a dimension 'Numero de Projet'.
## Tag appears after '-' in campaign name.
##
## Copied by Grégory Pantaine
## created: 2024-05-10

# re and np are used below; imported explicitly in case the runtime does not preload them.
# dataSourceDict and tableize are not defined here and are assumed to be supplied by the host environment.
import re
import numpy as np

########### Configurable Params - START ##########
PLACEMENT_KEY = '-'

# Primary data source and columns
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_NUMERO_PROJET = 'Numero de Projet'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_NUMERO_PROJET = 'Numero de Projet'

# Function to extract tag from campaign name
def get_tag_from_campaign_name(campaign_name):
    # First hyphen that is immediately followed by word characters
    regex_pattern = r"-(\w+)"
    placement_match = re.search(regex_pattern, campaign_name)
    if placement_match:
        placement_value = placement_match.group(1)
        # print("Placement value:", placement_value)
        return placement_value.strip()
    else:
        print("Placement value not found: " + campaign_name)
        # Return the entire campaign name if no '-<tag>' match is found
        return campaign_name.strip()

# Copy all input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    existing_tag = row[RPT_COL_NUMERO_PROJET]
    campaign_name = row[RPT_COL_CAMPAIGN]

    # Skip processing if campaign name does not contain the placement key
    if PLACEMENT_KEY not in campaign_name:
        continue

    tag = get_tag_from_campaign_name(campaign_name)
    # Print campaign and tag information
    # print("Campaign [%s] => Tag [%s]" % (campaign_name, tag))

    # Only tag if it's different than the existing tag
    if len(tag) > 0 and tag != existing_tag:
        outputDf.at[index, BULK_COL_NUMERO_PROJET] = tag
    else:
        # Unchanged or empty: blank the cell so the row is dropped below
        outputDf.at[index, BULK_COL_NUMERO_PROJET] = np.nan

# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_NUMERO_PROJET])

# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()

if not outputDf.empty:
    print("outputDf", tableize(outputDf.head()))
else:
    print("Empty outputDf")
Post generated on 2024-11-27 06:58:46 GMT