Script 1349: Dimension Update Subregion

Purpose:

Automates the process of populating subregion dimensions in a dataset based on campaign name values.

To Elaborate

The Python script is designed to automate the task of assigning subregion dimensions to campaigns based on their names. This is particularly useful in marketing or sales environments where campaigns are often named according to specific regions or subregions. The script processes a dataset, identifies campaigns that do not yet have a subregion assigned, and uses predefined rules to determine the appropriate subregion based on the campaign name. This helps ensure consistency and accuracy in data reporting and analysis by systematically categorizing campaigns into their respective subregions.

Walking Through the Code

  1. Data Preparation
    • The script begins by identifying the primary data source, which contains columns for campaign, account, and subregion information.
    • It then filters this data to create a new dataframe (filteredDf) that includes only those rows where the subregion is not already populated.
  2. Subregion Determination
    • A function named determine_subregion is defined to map campaign names to subregions. This function uses a series of conditional checks to match the beginning of campaign names with predefined subregion codes.
    • The function is applied to each campaign name in the filtered dataframe to determine the corresponding subregion.
  3. Output Generation
    • The results of the subregion determination are stored in a new dataframe (outputDf), which contains only the rows that were initially filtered.
    • The script concludes by printing a preview of the updated dataframe, showcasing the newly populated subregion values.

Vitals

  • Script ID : 1349
  • Client ID / Customer ID: 1306913420 / 60268008
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Subregion
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Kyle Perkins (kyle.perkins@genesys.com)
  • Created by Kyle Perkins on 2024-08-28 17:36
  • Last Updated by Kyle Perkins on 2024-08-28 17:56
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
##
## name: Dimension Update - Subregion
## description:
##  Automates populating subregion dimensions based on campaign name values.
## 
## author: 
## created: 2024-08-28
## 

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# Primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_SUBREGION = 'Subregion'  # Reference to the subregion column

# Output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_SUBREGION = 'Subregion'  # Output subregion column

# Define the function that mimics the Excel formula to determine subregion
def determine_subregion(campaign_name):
    campaign_name = campaign_name.lower()  # Convert to lowercase to ignore case
    if campaign_name.startswith("apac_anz"):
        return "APAC - ANZ"
    elif campaign_name.startswith(("apac_all", "apac_en")):
        return "APAC - ALL"
    elif campaign_name.startswith("apac_asia"):
        return "APAC - ASIA"
    elif "apac_cn" in campaign_name:
        return "APAC - CN"
    elif campaign_name.startswith(("apac_in", "apac-in")):
        return "APAC - IN"
    elif campaign_name.startswith("apac_jp"):
        return "APAC - JP"
    elif campaign_name.startswith("apac_kr"):
        return "APAC - KR"
    elif campaign_name.startswith("apac_sea"):
        return "APAC - SEA"
    elif campaign_name.startswith("apac_taiwan"):
        return "APAC - Taiwan"
    elif campaign_name.startswith(("apac_th", "apac_thai")):
        return "APAC - THAI"
    elif campaign_name.startswith("emea_cz"):
        return "EMEA - CZ"
    elif campaign_name.startswith(("emea_de", "mk_de", "de -", "de|")):
        return "EMEA - DE"
    elif campaign_name.startswith("emea_dk"):
        return "EMEA - DK"
    elif campaign_name.startswith("emea_es"):
        return "EMEA - ES"
    elif campaign_name.startswith("emea_fi"):
        return "EMEA - FI"
    elif campaign_name.startswith(("emea_fr", "fr -")):
        return "EMEA - FR"
    elif campaign_name.startswith("emea_gr"):
        return "EMEA - GR"
    elif campaign_name.startswith("emea_il"):
        return "EMEA - IL"
    elif campaign_name.startswith("emea_it"):
        return "EMEA - IT"
    elif campaign_name.startswith("emea_me"):
        return "EMEA - ME"
    elif campaign_name.startswith("emea_nl"):
        return "EMEA - NL"
    elif campaign_name.startswith("emea_no"):
        return "EMEA - NO"
    elif campaign_name.startswith("emea_pl"):
        return "EMEA - PL"
    elif campaign_name.startswith("emea_roemea"):
        return "EMEA - RoEMEA"
    elif campaign_name.startswith("emea_ro"):
        return "EMEA - RO"
    elif campaign_name.startswith("emea_ru"):
        return "EMEA - RU"
    elif campaign_name.startswith("emea_se"):
        return "EMEA - SE"
    elif campaign_name.startswith("emea_tr"):
        return "EMEA - TR"
    elif campaign_name.startswith(("emea_uk", "mk_uk", "mpx_uk", "uk -")):
        return "EMEA - UK"
    elif campaign_name.startswith("emea_za"):
        return "EMEA - ZA"
    elif campaign_name.startswith(("latam_all", "latam_an", "latam_andean")):
        return "LATAM - NOLA"
    elif campaign_name.startswith(("latam_co", "latam_cr", "latam_ec", "latam_peru", "latam_nola")):
        return "LATAM - NOLA"
    elif campaign_name.startswith(("latam_sola", "latam_ar", "latam_chile", "latam_cl", "latam_sc")):
        return "LATAM - SOLA"
    elif campaign_name.startswith("latam_br"):
        return "LATAM - BR"
    elif campaign_name.startswith("latam_mx"):
        return "LATAM - MX"
    elif campaign_name.startswith(("latam_es", "latam_pt")):
        return "LATAM - ES (Legacy)" if "latam_es" in campaign_name else "LATAM - PT (Legacy)"
    elif campaign_name.startswith(("na_na", "na_ca", "dx_us", "bold_us")):
        return "NORAM - NA"
    else:
        return ""

# Filter inputDf to skip rows where subregion is already populated and copy rows to new filteredDf dataframe
filteredDf = inputDf[~inputDf[RPT_COL_SUBREGION].notna()].copy()

# Initialize outputDf so that it only includes rows that have been filtered
outputDf = filteredDf

# Apply the function to each row in the filtered dataframe and set result as the output dataframe value
outputDf[BULK_COL_SUBREGION] = filteredDf[RPT_COL_CAMPAIGN].apply(determine_subregion)

# Printing the result
print(tableize(outputDf.head()))

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus