Script 1067: Script Campaign Numero de Projet

Purpose

Parse campaign names and extract project numbers into a dimension called “Numero de Projet”.

To Elaborate

The Python script extracts project numbers from campaign names. It takes the primary data source containing the campaign names and outputs a dataframe in which each extracted project number is stored in a dimension called “Numero de Projet”. The project number is the tag that immediately follows a '-' character in the campaign name. Campaign names that contain no '-' are skipped, and rows whose tag ends up empty are dropped from the output.
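To make the extraction rule concrete, here is a minimal sketch of how the pattern `-(\w+)` behaves on a few campaign names. The helper name `extract_project_number` and the sample names are hypothetical and used for illustration only. Unlike the script, this sketch returns None when nothing matches instead of falling back to the full campaign name.

```python
import re

# Same pattern the script uses: the first run of word characters right after a '-'
TAG_PATTERN = re.compile(r"-(\w+)")

def extract_project_number(campaign_name):
    """Return the tag following a '-', or None if no such tag exists (hypothetical helper)."""
    match = TAG_PATTERN.search(campaign_name)
    return match.group(1) if match else None

# Hypothetical campaign names, for illustration only
samples = ["Brand_FR-12345", "Promo-AB12-Summer", "Generic - 678", "NoHyphenHere"]
for name in samples:
    print(name, "=>", extract_project_number(name))
```

Note that a hyphen followed by a space, as in "Generic - 678", does not match, because `\w+` needs a word character directly after the '-'.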

Walking Through the Code

  1. The script starts by defining a configurable parameter PLACEMENT_KEY, which is set to '-'.
  2. It then defines the input dataframe inputDf, the primary data source containing the campaign names.
  3. Next, it defines constants for the report and bulk column names (Account, Campaign, Numero de Projet).
  4. The script defines a function get_tag_from_campaign_name, which takes a campaign name and uses a regular expression to extract the first run of word characters immediately after a '-' character. If nothing matches, the function prints a message and returns the whole campaign name, stripped of surrounding whitespace.
  5. The script creates a copy of the input dataframe as the output dataframe outputDf.
  6. It then loops through each row in the input dataframe.
  7. For each row, it retrieves the existing tag and the campaign name.
  8. If the campaign name does not contain the PLACEMENT_KEY, it skips processing for that row.
  9. Otherwise, it calls get_tag_from_campaign_name to extract the tag from the campaign name.
  10. If the extracted tag is not empty and differs from the existing tag, it writes the extracted tag to the corresponding row of the output dataframe.
  11. If the extracted tag is empty or identical to the existing tag, it sets the corresponding row in the output dataframe to NaN.
  12. After processing all rows, it drops the rows whose tag is NaN from the output dataframe.
  13. It strips leading and trailing whitespace from the campaign names in the output dataframe.
  14. Finally, it checks whether the output dataframe is empty and prints either the tableized head of the dataframe or a message indicating that the output dataframe is empty.
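The steps above can be sketched on a small, self-contained dataframe. This is only an illustration under stated assumptions: the sample data is invented, the names `input_df` and `output_df` are hypothetical, and the row loop is replaced by the vectorized pandas call `Series.str.extract`. It also leaves unmatched names empty rather than falling back to the full campaign name as the script does.

```python
import pandas as pd

# Hypothetical input standing in for the report data source
input_df = pd.DataFrame({
    "Account": ["A1", "A1", "A2", "A2"],
    "Campaign": [" Brand-1001 ", "Promo-2002", "NoTag", "Search-3003"],
    "Numero de Projet": [None, "2002", None, "9999"],
})

output_df = input_df.copy()

# Extract the first run of word characters after a '-'; NaN where nothing matches
extracted = output_df["Campaign"].str.extract(r"-(\w+)", expand=False)

# Keep a tag only if one was found and it differs from the existing value
changed = extracted.notna() & (extracted != output_df["Numero de Projet"])
output_df["Numero de Projet"] = extracted.where(changed)

# Drop rows with nothing to update, then trim whitespace from campaign names
output_df = output_df.dropna(subset=["Numero de Projet"])
output_df["Campaign"] = output_df["Campaign"].str.strip()

print(output_df)
```

In this sample only the first and last rows survive: the second row already carries the tag "2002", and "NoTag" contains no '-'.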

Vitals

  • Script ID: 1067
  • Client ID / Customer ID: 1306927809 / 60270355
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Numero de Projet
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-05-10 16:33
  • Last Updated by Grégory Pantaine on 2024-05-10 16:37

Python Code

## name: Script - Campaign - Numero de Projet
## description:
## Parse Campaign Name and pick out project number into a dimension 'Numero de Projet'.
## Tag appears after '-' in campaign name.
## 
## Copied by Grégory Pantaine
## created: 2024-05-10

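# Imports used below (harmless if the hosting environment already provides them)
import re
import numpy as np

########### Configurable Params - START ##########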
PLACEMENT_KEY = '-'

# Primary data source and columns
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_NUMERO_PROJET = 'Numero de Projet'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_NUMERO_PROJET = 'Numero de Projet'

# Function to extract tag from campaign name

def get_tag_from_campaign_name(campaign_name):
    regex_pattern = r"-(\w+)"

    placement_match = re.search(regex_pattern, campaign_name)
    if placement_match:
        placement_value = placement_match.group(1)
        # print("Placement value:", placement_value)
        return placement_value.strip()
    else:
        print("Placement value not found: " + campaign_name)
    
    # No match (e.g. '-' followed by a space): fall back to the entire campaign name
    return campaign_name.strip()

# Copy all input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    existing_tag = row[RPT_COL_NUMERO_PROJET]
    campaign_name = row[RPT_COL_CAMPAIGN]
    
    # Skip processing if campaign name does not contain the placement key
    if PLACEMENT_KEY not in campaign_name:
        continue

    tag = get_tag_from_campaign_name(campaign_name)

    # Print campaign and tag information
    # print("Campaign [%s] => Tag [%s]" % (campaign_name, tag))

    # Write the tag only if it is non-empty and differs from the existing tag;
    # otherwise mark the row as NaN so it is dropped from the bulk output
    if len(tag) > 0 and tag != existing_tag:
        outputDf.at[index, BULK_COL_NUMERO_PROJET] = tag
    else:
        outputDf.at[index, BULK_COL_NUMERO_PROJET] = np.nan

# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_NUMERO_PROJET])

# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()

if not outputDf.empty:
    print("outputDf", tableize(outputDf.head()))
else:
    print("Empty outputDf")

Post generated on 2024-05-15 07:44:05 GMT
