The EW-Shopp toolkit helps companies operating in the eCommerce, Retail and Marketing industries improve their efficiency and competitiveness. It supports weather and event-based integration and analytics of the data that companies collect during the shopper’s journey, which track sales, consumer behavior, and the performance of marketing campaigns.
The EW-Shopp toolkit is an open source software ecosystem capable of managing data in tabular format and of generating linked data to be used for analytics and visualization. The EW-Shopp toolkit covers the three main activities commonly identified in a data science project: data preparation and enrichment, data visualization and data analysis.
- Data preparation and enrichment. This data processing step can be carried out with three tools: DataGraft, ASIA, and ABSTAT. DataGraft and its data transformation tool Grafterizer provide data management, data cleaning, modelling, preparation and graph transformation functionalities based on user-specified transformations. ASIA is a tool for the semantic enrichment of data available in tabular formats, which helps users integrate business data with event and weather data. Semantic reconciliation algorithms are integrated into a user interface to help users map the data schema to shared vocabularies and ontologies and link data values to shared systems of identifiers. Data enrichment widgets exploit these links to ease the extraction of additional data from third-party sources and their fusion into the original tabular data. ABSTAT is integrated with DataGraft to deliver end-to-end table manipulation and semantic enrichment functionalities. ABSTAT is a tool to profile knowledge graphs represented in RDF based on linked data summarization mechanisms. The profiles extracted by ABSTAT describe the content of the knowledge graphs using abstraction (schema-level patterns) and statistics. The profiles help users understand the content of the knowledge graphs used in the platform (e.g., linked product data), support ASIA’s semantic reconciliation algorithms, and provide data quality insights.
- Data visualization and navigation. The suite is based on the KnowAge tool and provides tools for producing high-quality reports on the transformed, enriched and analyzed information obtained from the platform.
- Data analysis. This functionality is implemented using the QMiner data analytics platform and is open to other solutions. Within the toolkit, QMiner learns models from historic datasets and uses them for prediction on new data points. It implements a comprehensive set of techniques for supervised, unsupervised and active learning, which support both structured and unstructured data.
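The enrichment step described above, linking cell values to a shared system of identifiers and then using those links to pull in third-party data, can be sketched in a few lines of Python. This is only an illustration of the idea, not the ASIA implementation: the table, the place identifiers, the weather values, and the function names `reconcile` and `enrich` are all hypothetical.

```python
from datetime import date

# Hypothetical business table: daily sales per store city (illustrative values).
sales = [
    {"date": date(2019, 3, 1), "city": "Milano", "units_sold": 120},
    {"date": date(2019, 3, 1), "city": "Ljubljana", "units_sold": 80},
    {"date": date(2019, 3, 2), "city": "Milan", "units_sold": 95},
]

# Hypothetical shared identifier system: several surface forms map to one place ID.
PLACE_IDS = {
    "milano": "place:milan",
    "milan": "place:milan",
    "ljubljana": "place:ljubljana",
}

# Hypothetical third-party weather data, keyed by (place ID, date).
WEATHER = {
    ("place:milan", date(2019, 3, 1)): {"temp_c": 12.5, "rain_mm": 0.0},
    ("place:milan", date(2019, 3, 2)): {"temp_c": 9.0, "rain_mm": 4.2},
    ("place:ljubljana", date(2019, 3, 1)): {"temp_c": 7.5, "rain_mm": 1.1},
}


def reconcile(city):
    """Link a raw cell value to a shared identifier, or None if it is unknown."""
    return PLACE_IDS.get(city.strip().lower())


def enrich(rows):
    """Return copies of the rows extended with a place ID and weather columns."""
    enriched = []
    for row in rows:
        place_id = reconcile(row["city"])
        # Rows without a match keep None in the new columns instead of being dropped.
        weather = WEATHER.get((place_id, row["date"]), {})
        enriched.append({
            **row,
            "place_id": place_id,
            "temp_c": weather.get("temp_c"),
            "rain_mm": weather.get("rain_mm"),
        })
    return enriched


enriched_sales = enrich(sales)
for row in enriched_sales:
    print(row)
```

Because "Milano" and "Milan" reconcile to the same identifier, both rows receive weather data for the same place even though the original cell values differ.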
The main application areas are those tested in the EW-Shopp project: eCommerce, Retail, Customer Relationship Management (CRM), Digital Marketing, and IoT. However, the toolkit can work in many other application domains, including societal and economic analysis.
Many companies operating in domains like eCommerce, Retail, Customer Relationship Management (CRM) and Digital Marketing, or providing IT services to companies operating in these domains, collect large amounts of data about customers at different touch points across the so-called consumer journey.
Customers get informed about products through online and offline advertising campaigns, e.g., through ads displayed on their browsers or calls from contact centers, and by proactively searching for information online and offline, e.g., using comparison shopping engines or visiting physical stores. Customers make their purchase decision and may search for further assistance through customer care departments.
Data analytics provide a powerful means to gain customer insights, but their effectiveness depends on the data they are fed with. Data collected by individual companies often provide only a partial view of the customer journey, and the analytical models they use often neglect factors that have an important impact on customers’ decisions. EW-Shopp aimed to help companies gain deeper customer insights by supporting the development of analytical services that use rich models, which also consider events that affect customer decisions, such as weather, marketing campaigns, and holidays.
The toolkit aims to provide components that facilitate all the data processing steps required to develop reliable weather and event-based data-driven services, including data preparation and enrichment, predictive and descriptive analytics, and visualization. In addition, the project aims to demonstrate the effectiveness of weather and event-based data analytics for developing valuable business services, as well as the usefulness of the delivered toolkit.
Benefits from using the toolkit can be summarized as follows:
- Data preparation and enrichment: native support for semantic data integration (construction of RDF knowledge graphs, usage of RDF knowledge graphs for data enrichment, semantic matching algorithms configured through user-friendly interfaces); native support for API-based access to third-party data and storage (support for pre-fetching large amounts of data from third-party sources available through APIs); cloud-readiness and scalability (native support for batch execution; support for deployment on scalable cloud infrastructure); modularity and flexibility (support for custom data manipulation functions; possibility to add new data reconciliation and extension services).
- Data visualization: open source suite with KnowAge; connection to several databases; user-friendly interfaces for configuring the visualizations.
- Data analytics toolset: open source scripts that users can use as a starting point to develop their analytics on different business scenarios; services built on top of the QMiner data analytics platform for building analytic models using event and weather data; support for data ingestion and transformation, as well as model building and deployment (as a REST service); weather and event data can be obtained using the weather data API and the Media Attention API; support for analytics on keywords based on FastText distributed representations (similarity and clustering).
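To illustrate what keyword similarity over distributed representations means, the sketch below computes cosine similarity between keyword vectors. It does not use the toolkit's scripts or the FastText library: the three-dimensional vectors are made-up stand-ins for real embeddings, which have far more dimensions, and the function names are hypothetical.

```python
import math

# Toy vectors standing in for keyword embeddings (made-up values, not FastText output).
EMBEDDINGS = {
    "umbrella": [0.9, 0.1, 0.0],
    "raincoat": [0.8, 0.2, 0.1],
    "sunscreen": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(keyword, embeddings):
    """Return the (keyword, similarity) pair closest to the given keyword."""
    target = embeddings[keyword]
    candidates = (
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != keyword
    )
    return max(candidates, key=lambda pair: pair[1])


best_match, score = most_similar("umbrella", EMBEDDINGS)
print(best_match, round(score, 3))
```

With these toy vectors, "umbrella" is closest to "raincoat". Clustering builds on the same measure by grouping keywords whose vectors lie close to each other.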
The toolkit provides, to the best of our knowledge, the first open source set of components that are developed to support weather and event-based analytics.
The data preparation and enrichment component, which results from the integration of DataGraft and ASIA, provides the first solution in the market that supports semantic data enrichment on large data volumes, thus solving known issues of the few alternatives in the market, such as OpenRefine. The solution is the only one able to provide two different but related functionalities: transformation of tabular data into RDF knowledge graphs and tabular data extension. In addition, compared to other UI-based data enrichment frameworks like Knime, the tool uses a unique table-first approach to data enrichment, where the user models the transformations by annotating the table and inspecting the results of the specified transformation on a data sample. For more information about the novelty of the tool, see also our ESWC2019 tutorial on semantic data enrichment for data scientists at https://ew-shopp.github.io/eswc2019-tutorial/
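The table-first idea mentioned above, annotating columns and then inspecting the resulting graph on a data sample, can be sketched as follows. This is a minimal illustration under stated assumptions, not DataGraft or ASIA code: the sample rows, the `http://example.org/product/` base URI, the choice of schema.org properties, and the function names are all hypothetical.

```python
# Hypothetical sample of a product table (illustrative values).
sample_rows = [
    {"sku": "A100", "name": "Rain jacket", "price": "59.90"},
    {"sku": "B200", "name": "Sun hat", "price": "14.50"},
]

# Hypothetical base URI used to mint one subject per row.
BASE = "http://example.org/product/"

# Column annotations: column name -> (predicate URI, optional literal datatype).
ANNOTATIONS = {
    "name": ("http://schema.org/name", None),
    "price": ("http://schema.org/price", "http://www.w3.org/2001/XMLSchema#decimal"),
}


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(row, subject_column="sku"):
    """Turn one table row into N-Triples lines according to the annotations."""
    subject = f"<{BASE}{row[subject_column]}>"
    triples = []
    for column, (predicate, datatype) in ANNOTATIONS.items():
        literal = f'"{escape_literal(row[column])}"'
        if datatype:
            literal += f"^^<{datatype}>"
        triples.append(f"{subject} <{predicate}> {literal} .")
    return triples


# Inspect the result of the annotations on the data sample.
for row in sample_rows:
    for triple in row_to_ntriples(row):
        print(triple)
```

Changing an entry in `ANNOTATIONS` and rerunning the loop on the sample shows immediately how the generated triples change, which is the feedback loop the table-first approach relies on.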