Script 1599: Script Auto Tag Country & Friendly Name

Purpose:

The script extracts country codes and friendly names from campaign names in a dataset.

To Elaborate

The Python script is designed to process a dataset containing campaign information and extract specific details from the campaign names. It focuses on identifying and separating the country code and a friendly name from each campaign name. The script assumes that the campaign names follow a structured format, where these details can be extracted based on their position within the name. The extracted information is then used to populate new columns in the output dataset, which can be used for further analysis or reporting. The script ensures that only rows with successfully extracted data are retained, thereby maintaining the integrity and relevance of the output.

Walking Through the Code

  1. Data Preparation:
    • The script begins by importing necessary libraries and defining the primary data source, which is a DataFrame containing campaign data.
    • It specifies the columns of interest from the input data, such as ‘Campaign’, ‘Campaign ID’, and ‘Account’.
  2. Function Definitions:
    • Two functions, extract_country and extract_friendlyname, are defined to parse the campaign name. They split the name by underscores and extract the country code and friendly name based on their positions.
  3. Data Processing:
    • An output DataFrame is created, copying relevant columns from the input data.
    • The extraction functions are applied to the ‘Campaign’ column to populate new columns for country codes and friendly names.
  4. Data Cleaning:
    • Rows where extraction fails (resulting in None values) are removed to ensure the output only contains valid data.
  5. Output:
    • The script prints the first few rows of both the input and output DataFrames for verification purposes.

Vitals

  • Script ID : 1599
  • Client ID / Customer ID: 1306927959 / 60270403
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Country_Dim, FriendlyName_Dim, Campaign ID
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-12-23 21:27
  • Last Updated by Grégory Pantaine on 2025-01-23 14:25
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
##
## name: Script - Auto Tag - Country & Friendly Name
## description: Extracts country code and friendly name from campaign names
## 
## author: 
## created: 2024-12-23
## 

import datetime
import pandas as pd

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_CAMPAIGN_ID = 'Campaign ID'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
RPT_COL_COUNTRY_DIM = 'Country_Dim'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_COUNTRY_DIM = 'Country_Dim'
BULK_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
BULK_COL_CAMPAIGN_ID = 'Campaign ID'

# Function to extract country code
def extract_country(campaign_name):
    try:
        parts = campaign_name.split('_')
        if len(parts) >= 4:
            return parts[3][:2]  # First two characters after third underscore
        return None
    except:
        return None

# Function to extract friendly name
def extract_friendlyname(campaign_name):
    try:
        parts = campaign_name.split('_')
        if len(parts) >= 4:
            return parts[3][2:]  # Everything after first two characters after third underscore
        return None
    except:
        return None

# Create output dataframe with required columns
outputDf = inputDf[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_CAMPAIGN_ID]].copy()
outputDf.columns = [BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN, BULK_COL_CAMPAIGN_ID]

# Apply the extraction functions separately
outputDf[BULK_COL_COUNTRY_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_country)
outputDf[BULK_COL_FRIENDLYNAME_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_friendlyname)

# Remove rows where extraction failed (where either value is None)
outputDf = outputDf.dropna(subset=[BULK_COL_COUNTRY_DIM, BULK_COL_FRIENDLYNAME_DIM])

# Debug print first few rows
print(tableize(inputDf.head()))
print("\nOutput DataFrame:")
print(tableize(outputDf.head()))

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus