Data Analysis optimization using SPSS / Python scripts
-
Business case. The client conducts a tracking survey and analyzes the data using specific algorithms (branded solutions) to calculate specific indexes and parameters. The algorithm includes numerous data transformations, creation of new variables, recalculations, and standardizations.
Client’s needs. Reducing the time spent on data processing and analysis, and automating the process to avoid manual mistakes.
Solution proposed. The client used SPSS for analysis and planned to keep using it. All data transformations, data operations, and calculations of indexes and parameters were therefore delivered as an SPSS script. Some transformations and recalculations required creating new variables, deleting variables, or saving data to external files for further calculation; the procedures that could not be automated with SPSS syntax were written in Python.
Results. The SPSS script, with automation and data transformations implemented in Python, cut the time required for data analysis by more than 60%.
Power BI Dashboarding
-
Business case. The client is a consultancy company focused on intangible assets. The client’s key product is reports based on large quantitative panel surveys. Reporting was traditionally done in PowerPoint and, for some projects, included more than 100 slides of charts and tables. The large number of manually built slides increased the risk of mistakes and discrepancies.
Client’s needs. Streamlining reporting and reducing the “human factor” in the quality and accuracy of deliverables.
Solution proposed. Reports were rebuilt as Power BI dashboards, preserving the format and formatting of all visualizations. Data was taken from the SPSS file, exported to Excel tables, and then loaded into Power BI. The dashboards followed the client’s brand book and looked the same as the original PowerPoint slides. For tracking surveys, the client can upload data from new waves automatically, without the risk of manual mistakes.
Results. Power BI dashboards cut the time required for report development by more than 70% and protect the client from mistakes and discrepancies caused by the “human factor” during reporting.
Automation of OEs coding for research projects
-
Business case. The client is a consultancy company that conducts numerous tracking surveys. Each survey contains numerous open-ended questions (OEs), which are coded using the same code lists. Sometimes specific open-ended questions are added in one or several waves. Manual coding of OEs takes a lot of time and carries a risk of mistakes, especially for OEs that contain sentences or phrases.
Client’s needs. Reducing the time required for both recurring and new open-ended questions, and reducing the “human factor” in the quality and accuracy of coding. The ability to automate the coding of both “simple” open-ends (like brand attributes) and difficult open-ends that contain sentences or phrases.
Solution proposed. The client uses SPSS for data analysis, and all data files are provided by the fieldwork supplier in SPSS format. That is why the code for OE coding automation was developed using SPSS and Python. The client must use a standardized Excel format for code lists. The script compares all answers in the open-ended variable with the code list and, where they match, codes them into a new variable (or several variables if multiple responses were allowed). All non-coded answers are then written to an Excel output. To cover more answers, the client adds new codes to the code list and re-runs the script.
Results. The script runs in SPSS: the client specifies the variable to be coded and the path to the code list, and the coding is then done automatically. The same script can be applied to different open-ended questions, both pre-existing and new. It works for both simple and difficult (containing sentences or phrases) open-ended questions because it depends only on the code list and requires no changes to the script itself.
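The matching step in this case can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the script delivered to the client: the code list is a hypothetical in-memory dictionary (in the project it came from a standardized Excel file), matching is a simple case-insensitive whole-phrase check, and all names (`CODE_LIST`, `code_answer`, `code_open_ends`) are invented for the example.

```python
import re

# Hypothetical code list: code -> keywords or phrases that trigger it.
CODE_LIST = {
    1: ["cheap", "low price"],
    2: ["good quality", "reliable"],
    3: ["friendly staff"],
}

def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())

def code_answer(answer, code_list):
    """Return every code whose keyword or phrase occurs as whole words."""
    padded = f" {normalize(answer)} "
    return [
        code
        for code, keywords in code_list.items()
        if any(f" {normalize(kw)} " in padded for kw in keywords)
    ]

def code_open_ends(answers, code_list):
    """Split answers into coded ones and a list left for manual review."""
    coded, uncoded = {}, []
    for respondent_id, answer in answers.items():
        codes = code_answer(answer, code_list)
        if codes:
            coded[respondent_id] = codes  # several codes = multiple response
        else:
            uncoded.append((respondent_id, answer))
    return coded, uncoded

answers = {
    101: "Really cheap, and reliable too!",
    102: "Friendly staff.",
    103: "I like the logo",
}
coded, uncoded = code_open_ends(answers, CODE_LIST)
print(coded)    # {101: [1, 2], 102: [3]}
print(uncoded)  # [(103, 'I like the logo')]
```

Because the logic depends only on the code list, covering a new open-ended question in this sketch means supplying a different dictionary, which mirrors the design choice described above.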
Developing new client segmentations
-
Business case. The client has a special methodology that creates complex, highly customized multi-factor segments for its own clients. The methodology depends heavily on the data gathered and must be adjusted every time.
Client’s needs. Custom adjustment of the segmentation coefficients depending on the data structure, and new SPSS syntax for each segment.
Solution proposed. Running cluster and discriminant analysis for each wave of every project, creating new custom code that fits the client’s methodology, and at the same time describing the structure of the key defining variables in the current dataset. An additional set of Excel tables was prepared to check the newly formed clusters.
Results. A set of complex SPSS scripts designed for each cluster type and each situation, together with a set of Excel checking tables, was provided to the client. This saves the client time and lets the client move straight to interpretation.
PowerPoint reports outsourcing
-
Business case. The client produces many client reports in PowerPoint on a monthly, quarterly, semi-annual, or yearly basis. Most of the client’s time goes to data processing and populating the slides, leaving very little for the most valuable part: conclusions and recommendations for the end client.
Client’s needs. Outsourcing the most time-consuming processes (Excel data output and PowerPoint slide population) so that the client can concentrate on the real work with its clients: understanding their needs and translating them into consulting and recommendations.
Solution proposed. Our team carried out the two main parts of the process: analysis of the raw data according to the client’s methodology, and building the PowerPoint slides from that data. The original template is provided by the client in line with its brand book and overall vision of the deliverable (colors, types of analysis, comparisons to include, etc.).
Results. The Excel output with the data tables and the PowerPoint presentation save the client a lot of time and are delivered faster than the client’s own workflow allows.
Creating complex automated Excel dashboards
-
Business case. The client ran internal research on corporate culture and employee evaluation (including both self-evaluation and evaluation by colleagues). The company had a very complex multi-level structure, and many hundreds of staff had to be evaluated. Because of this structure, people at higher levels could have both their own team and their own supervisor.
Client’s needs. Each of those hundreds of employees needed an individual report with their own survey results, comparisons between self-evaluation and team evaluation, and dynamics between waves. The results had to be presented not at the company level but customized for each person: at the level of the particular employee, their team, and their own KPIs.
Solution proposed. A set of Excel macros was designed specifically to populate these personalized reports automatically. The macros created the hundreds of new files and exported the results at different levels depending on the person, their level, their team, and their department, using multiple embedded Excel formulas. The reports were also automatically formatted in a client-friendly form, with all the required colors, fonts, and other details.
Results. Hundreds of personalized Excel reports were produced almost entirely by automation, saving the client a lot of time.
Survey links testing
-
Business case. The client has a number of custom multi-country, multi-company, and multi-stakeholder survey trackers, all conducted through an online panel provider. They usually run monthly and can change every wave (different markets or languages, new questions or companies). Because of the complexity of the survey, the client usually cannot check all the survey branches before the survey has to go live.
Client’s needs. Every change made in the current wave must be checked in every language of the tracker (including non-Latin scripts), and every existing condition between questions must be checked as well. All mistakes found must be reported back to the panel provider and double-checked once they are fixed.
Solution proposed. A detailed SOP covering all checking processes, plus simplified and organized checking of all scripts and languages with detailed Excel checking sheets, which the tester follows each wave to test all possible changes.
Results. The client saved a lot of time through this outsourcing and the structured Excel checking plan, which made it possible to find as many mistakes as possible in the short period before the survey went live.
Quantification of the qualitative information and building the indices
-
Business case. The client has a set of qualitative data sources for different stakeholders: some are open-ended questions among the General Public or Employees, and others are gathered media analysis (all the articles and social media posts that mention the client or its activities).
Client’s needs. The open-ends and media information should be quantified, that is, presented as a numerical index on the usual scale of the client’s methodology. The solution should provide not only the numerical index itself but also the relative impact of different topics. In addition, articles and mentions in the media dataset should carry different weights, because some media sources are more significant for the client than others.
Solution proposed. A special procedure was designed in which the open-ends were classified as ‘strongly negative’, ‘slightly negative’, ‘neutral’, ‘slightly positive’, or ‘strongly positive’ (applied differently for different topics). This quantified scale was then processed by dedicated SPSS recoding syntax, which allowed us to present the topics on a client-friendly scale, to weight the overall file according to media significance, and to determine the impact of every topic on the overall image.
Results. The client has a number of new indices that are comparable within its methodology and KPIs and that provide many additional insights on stakeholders (i.e. media) that were previously only a separate qualitative dataset.
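The recoding and weighting idea can be illustrated with a short Python sketch. The numeric scores (-2 to +2), the rescaling to a 0-100 index, the topic names, and the weights are all assumptions made for the example; the client’s actual scale and the SPSS recoding syntax are not reproduced here, and the topic-impact calculation is left out.

```python
# Assumed mapping from the five sentiment labels to numeric scores.
SENTIMENT_SCORES = {
    "strongly negative": -2,
    "slightly negative": -1,
    "neutral": 0,
    "slightly positive": 1,
    "strongly positive": 2,
}

def weighted_index(mentions):
    """mentions: list of (sentiment_label, weight) pairs.

    Returns the weighted mean sentiment rescaled from [-2, 2] to [0, 100].
    """
    total_weight = sum(weight for _, weight in mentions)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    mean = sum(SENTIMENT_SCORES[s] * w for s, w in mentions) / total_weight
    return (mean + 2) / 4 * 100

def index_by_topic(rows):
    """rows: list of (topic, sentiment_label, weight) triples."""
    by_topic = {}
    for topic, sentiment, weight in rows:
        by_topic.setdefault(topic, []).append((sentiment, weight))
    return {topic: weighted_index(m) for topic, m in by_topic.items()}

# Hypothetical media mentions; a higher weight means a more significant source.
rows = [
    ("environment", "strongly negative", 3.0),
    ("environment", "neutral", 1.0),
    ("employment", "slightly positive", 2.0),
    ("employment", "strongly positive", 2.0),
]
overall = weighted_index([(s, w) for _, s, w in rows])
topics = index_by_topic(rows)
print(overall)  # 50.0
print(topics)   # {'environment': 12.5, 'employment': 87.5}
```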
Modeling of the links between financial/economic and intangible indices
-
Business case. The client has various data about its multi-level, diversified business that was gathered separately over the years and stored separately (CSR spending, wages, media indices, investment level, annual plan fulfilment by each sub-entity, macroeconomic trends, social climate, etc.). It also has a continuous reputation measurement as a KPI.
Client’s needs. Establishing the hierarchy of impact of all these separately stored economic performance factors on reputation as the final intangible KPI.
Solution proposed. Three steps were proposed: a) merging the separate historical data from different departments and years into one combined data file, with all the needed transformations and recodings; b) identifying the redundant factors that have no impact in the model and should be ignored; c) building the final regression model with the final list of factors.
Results. The client now has a hierarchy of the economic and financial factors with the greatest impact on its reputation. Separate regression models with separate lists of factors were developed not only at the overall company level but also for the company’s main sub-businesses, giving the client a deeper view and room for its own management decisions.
Forecasting the company's labor force needs based on pupil/student statistics in the region
-
Business case. The client owns a number of large enterprises in small towns across the country that have historically been economic strongholds for this profession and for the towns themselves. As time has passed and the economy has opened up, attracting young people to this type of enterprise has become a major problem.
Client’s needs. Finding the main reasons that prevent young people from choosing this profession and an enterprise in a small town, and the main ways to promote and steer career orientation in the desired direction.
Solution proposed. A combined study examined the question from different angles: a) focus groups and in-depth interviews with pupils and their parents from these small towns, which revealed the main concerns and reasons behind career choices; b) quantitative research among high school pupils.
Results. The client received a set of recommendations on how to promote its enterprises and the profession in these towns, the main methods to steer career orientation in its direction, and the main concerns and points to address in order to maintain a good employer image in the towns.
Evaluation of the link between brand strength and reputation
-
Business case. The client has a long history of using Brand Strength indices as the main KPI for evaluating its performance. Following recent events at the company, it began to question the role of reputation in the overall company landscape (this parameter was also tracked in the monthly longitudinal study).
Client’s needs. Finding proof that a statistical link between the two parameters (Brand Strength and Reputation) exists, and finding the key points of influence.
Solution proposed. Since both parameters had been measured every month and had enough history, the analysis had three parts: a) time series research; b) calculating correlations between these parameters, and also with the different elements of reputation; c) adding monthly longitudinal comparisons to the monthly charts in the report, so the client could see the impact visually, even when the effect lags in time.
Results. The client has statistical evidence that this link exists. Moreover, the effect may not be immediate: the results can show up in the following month or months. The client also has a hierarchy showing which elements of reputation have the strongest link and the largest potential effect, which effects are more immediate, and which lag in time.
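A lagged effect of this kind can be checked by correlating one monthly series with the other shifted by 0, 1, 2, ... months. The sketch below is an illustration only: the monthly values are invented, the direction (reputation leading Brand Strength) is an assumption, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def lagged_correlations(driver, outcome, max_lag):
    """Correlate driver[t] with outcome[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]  # drop the last `lag` months
        y = outcome[lag:]                # drop the first `lag` months
        result[lag] = pearson(x, y)
    return result

# Invented monthly data: brand strength follows reputation one month later.
reputation = [60, 62, 61, 65, 67, 64, 66, 70, 69, 72]
brand_strength = [50] + [r - 10 for r in reputation[:-1]]

correlations = lagged_correlations(reputation, brand_strength, max_lag=3)
best_lag = max(correlations, key=correlations.get)
print(best_lag)  # 1
```

Comparing the coefficients across lags is what separates "more immediate" effects from lagging ones; with real data each lag shortens the series, so longer lags rest on fewer observations.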
Evaluation of the link between NPS and reputation
-
Business case. The client has a long history of using NPS indices as the main KPI for evaluating its performance. It also has a large number of internal clusters of the general public (GenPub), and it tracks company reputation parameters as well.
Client’s needs. Finding proof that a statistical link between the two parameters (NPS and Reputation) exists, and investigating this link among the clusters and sub-groups that are potentially affected the most.
Solution proposed. Regression analysis was run at the overall company level and for the clusters. Additionally, for illustration, NPS was calculated separately for different reputation levels and for different sub-groups. All of this was also visualized in slides for the client.
Results. The client has statistical evidence that this link exists. Moreover, the analysis identified the most affected sub-groups, which deserve the most attention, so a full consulting story with the strongest and weakest points was developed for management.
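The illustrative NPS split can be sketched as follows. NPS is calculated in the standard way on a 0-10 "likelihood to recommend" scale (share of promoters scoring 9-10 minus share of detractors scoring 0-6). The respondent records, the field names, and the "low"/"high" reputation levels are assumptions for the example.

```python
def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("no scores given")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

def nps_by_group(respondents, group_key):
    """Compute NPS separately for each value of `group_key`."""
    groups = {}
    for r in respondents:
        groups.setdefault(r[group_key], []).append(r["recommend"])
    return {group: nps(scores) for group, scores in groups.items()}

# Hypothetical respondents with an assigned reputation level.
respondents = [
    {"reputation_level": "low", "recommend": 3},
    {"reputation_level": "low", "recommend": 6},
    {"reputation_level": "low", "recommend": 9},
    {"reputation_level": "low", "recommend": 7},
    {"reputation_level": "high", "recommend": 10},
    {"reputation_level": "high", "recommend": 9},
    {"reputation_level": "high", "recommend": 8},
    {"reputation_level": "high", "recommend": 5},
]
overall_nps = nps([r["recommend"] for r in respondents])
by_level = nps_by_group(respondents, "reputation_level")
print(overall_nps)  # 0.0
print(by_level)     # {'low': -25.0, 'high': 25.0}
```

The same `nps_by_group` call with a different key (for example a cluster or sub-group field) gives the sub-group comparison.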
Evaluation of the link between share value and reputation
-
Business case. The client has an online platform that continuously tracks reputation (and other intangible asset evaluations) over time. Because it is pre-designed and automated, it is not very flexible: it normally tracks only the original parameters and is hard to extend when additional insights are needed.
Client’s needs. The client wants to add open-source information to the online platform, showing not only the dynamics of intangible assets in the visual dashboards but also financial indices such as share prices, so that the link between the two parameters can be compared continuously.
Solution proposed. The solution had two parts: a) additional code that connected to the open-source data via an API and added it to the visual dashboards; b) exporting the two parameters and running a correlation between them to show that the link exists.
Results. The client’s visualization platform received a significant upgrade, and the client gained new insights from the added dynamics and comparisons. The client also has statistical evidence of the link between share price and reputation and can use it in building its narrative and future reports.
Developing a measurement methodology for trust in banks
-
Business case. The client (a bank) has begun to develop reputation management: after a year of reputation crisis, it needed to restore and manage the overall view of the company and to be perceived as a good member of society.
Client’s needs. The bank wants a custom-designed reputation model with attributes relevant to its own business and industry. These attributes should later become part of the evaluation research and of the KPIs.
Solution proposed. The solution had three stages: a) conducting a set of in-depth interviews with the banking community to gather the prevalent and most important topics and parameters; b) qualitative research using these criteria; c) statistical validation of the results, including correlations and regression coefficients, which confirmed the integrity of the newly created model and its relevance to the company’s business and potential problems.
Results. The client has an approved custom methodology and a set of specifically tailored attributes that became the basis for subsequent research, insights, reporting, and the bank’s future reputation management.
Identifying the predictors that impact the company’s reputation using path analysis
-
Business case. The client had a long history of reputation management and had worked with different attributes for some time. It had designed its own model, which needed a methodological re-evaluation covering not only its consistency but also causality, the direction of impacts inside the model, etc.
Client’s needs. The model needs more than standard regression coefficients: its elements must be evaluated for how they work together, whether together they deliver the originally intended result, and what the direction of impact between the model components is.
Solution proposed. The solution used a Path Analysis module in SPSS: specific combinations of the existing attributes were evaluated one by one. Some attributes turned out to be somewhat redundant, and others fell into the wrong category or dimension and told a somewhat different story from the one the report needed. The direction of the impacts was also evaluated.
Results. The client received a diagram of all the elements of the model and the links between them (shown as coefficients and arrows). The client can now identify the most and least influential factors, rationalize its model, and ultimately reshape it for future use and for its KPIs.
Identification of the impacts on charity / NGO donation intentions
-
Business case. The client has worked with different NGOs and other charity-related organizations, gathering a large amount of data on their support, awareness, brand strength, willingness to donate, etc. One of the charities had a specific request to investigate the impact of some key variables.
Client’s needs. The main goal was to find a group of variables that could have a joint impact on support for a certain charity (along with the variables connected to support, such as awareness, recommendations, charity engagement, and cases of the disease within the person’s close circle).
Solution proposed. The solution was carried out in several stages: a) building initial cross-correlations and finding the groups of variables with the strongest impact on the dependent variables; b) building initial regression models with the selected variables and evaluating them; c) identifying the main demographic splits that affect the dependent variables; d) reducing and optimizing the models and developing the "key predictors" lists for each model; e) comparing the key predictors lists between the models for demographic segments and the general model, and identifying common and unique predictors.
Results. The client has a set of impactful factors that work at the overall level and at the level of the key demographic groups. Another result was a list of redundant variables that are present in the study but add no value to reporting or to generating new insights. In the end, the client had a sound regression model with all the coefficients needed to talk to the charity and build recommendations.
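Stage e), comparing key predictor lists between the general model and the segment models, comes down to set operations once each model’s list is known. The sketch below assumes the lists have already been produced by the regression stages; the predictor names, segment labels, and the function name `compare_predictors` are hypothetical.

```python
def compare_predictors(general, segments):
    """Find predictors shared by all models and unique to each segment.

    general:  set of key predictors in the general model
    segments: dict mapping segment name -> set of key predictors
    """
    common = set.intersection(general, *segments.values())
    unique = {}
    for name, predictors in segments.items():
        # Everything that appears in any other model, including the general one.
        others = general.union(
            *(p for other, p in segments.items() if other != name)
        )
        unique[name] = predictors - others
    return common, unique

# Hypothetical key predictor lists.
general_model = {"awareness", "recommendation", "engagement"}
segment_models = {
    "18-34": {"awareness", "engagement", "social_media_contact"},
    "55+": {"awareness", "recommendation", "personal_experience"},
}
common, unique = compare_predictors(general_model, segment_models)
print(sorted(common))  # ['awareness']
print({name: sorted(p) for name, p in unique.items()})
# {'18-34': ['social_media_contact'], '55+': ['personal_experience']}
```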
Audience segmentation based on value orientations
-
Business case. The client, a consultancy company, was developing a communication and reputation strategy for a large telecommunication company, which is one of the leading providers of mobile communications and the Internet. For this purpose, a large quantitative survey was conducted. One of the approaches used for data analysis was audience segmentation.
Client’s needs. The client wanted to try more advanced segmentation criteria that were not based on demographic groups or on products/services used.
Solution proposed. Two segmentation criteria were proposed to the client: segmentation based on social class and segmentation based on values.
For the segmentation based on social class, a set of specific indicators was developed. They included questions about income, education, property, property rights, savings, and lifestyle. Respondents who met certain criteria were assigned to the low, middle, or upper class group.
For the segmentation based on values, statements from the Schwartz Value Survey methodology were used. Two types of analysis were then done: an “a priori” analysis identifying the standard segments described by Schwartz, and a combination of Factor and Cluster analysis identifying “post hoc” segments that are more general and consolidated than the original ones.
Results. The segmentations with the highest explanatory potential, the one based on social class and the “post hoc” segmentation based on Schwartz Value Survey questions, were used by the client to prioritize audiences, identify strategic targets, and develop relevant communication instruments.
Corporate culture diagnostics using the custom segmentation
-
Business case. The client (a bank) initiated a corporate culture diagnostic project to increase employee loyalty and reduce staff turnover. A quantitative employee survey was conducted to measure the current situation in the bank and to develop an action plan for increasing employee loyalty.
Client’s needs. Recognizing the heterogeneity of its employees, the significant differences in employee loyalty, and the distinctive characteristics of different departments, the client wanted to identify distinct groups, or segments, of employees; describe and characterize these groups; understand the degree of employee burnout in each of them; and define the key instruments and messages for communicating with each group.
Solution proposed. Standard approaches and typologies of corporate culture, such as Cameron and Quinn, Denison, and Sonnenfeld, did not cover all the client’s needs and requests. That is why an extended list of characteristics was developed for “post hoc” identification of employee groups (clusters). During data processing, Factor analysis and Cluster analysis were used to identify specific customized segments of employees. For each segment, its peculiarities, main motives, barriers, and pain points were described in detail.
Results. Information on the employee segments, their sizes, and their characteristics allowed the bank to develop a two-year HR strategy with a road map and action plan for each segment, making internal communication relevant and timely for employees and decreasing staff turnover.