Technical Papers

CASCON 2018 technical papers should be original research papers and be at most 12 pages long in ACM format (ACM_SigConf style). CASCON 2018 experience reports share experience and lessons gained from the use of technology that can be useful to others and should be at most 12 pages in ACM format. CASCON 2018 position papers share vision, highlight technology gaps, or discuss emerging advances and should be at most 8 pages in ACM format. CASCON 2018 will not accept submissions that have been previously published, are in press, or have been submitted elsewhere. Accepted technical papers will be included in the conference proceedings published by CASCON and in the ACM Digital Library.

Best Paper Awards

Two paper awards - Best Paper and Best Student Paper - will recognize the best technical contributions of the event in terms of originality, clarity, and potential impact. To be eligible for the Best Student Paper award, the paper must have been primarily authored by one or more students who were students at the time the work was done and who carried out the work described; only the student author(s) receive this award.

Congratulations to the Authors!

Accepted Papers

This year we received 68 full papers and 23 position papers.
We accepted 23 full papers and 10 position papers.
Acceptance rate for full papers was 33.8% and for position papers was 43.5%.

Thank you for your submissions. See you all at CASCON 2018.

A DevOps Framework for Quality-Driven Self-Protection in Web Software Systems

Nasim Beigi Mohammadi , York University ; Marin Litoiu , York University ; Mahsa Emami-Taba , University of Waterloo ; Ladan Tahvildari , University of Waterloo ; Marios Fokaefs , Polytechnique Montreal ; Ettore Merlo , Polytechnique Montreal ; Iosif Viorel Onut , IBM Canada Ltd. ;

Modern software is developed, deployed, and operated continuously. At the same time, cyberattacks are on the rise. The continuity of development and operations and the constant threat of attacks require novel approaches to identify, analyze, and address potential security vulnerabilities. In this continuous and volatile execution environment, factors like security, performance, cost, and functionality may not all be guaranteed to the same degree at the same time. In this work, we propose a DevOps framework for security adaptation that enables the development and operations teams to collaborate and address security vulnerabilities. The proposed framework spans the different phases of software (development, operations, maintenance) and considers all other factors (performance, cost, functionality) when deciding on security adaptations. We demonstrate the approach on a prototype tool that shows how teams work together to tackle security concerns.

A Survey of Ahead-of-Time Technologies in Dynamic Language Environments

Mark Thom , University of New Brunswick ; Gerhard Dueck , University of New Brunswick ; Kenneth Kent , University of New Brunswick ; Daryl Maier , IBM Canada Ltd. ;

Eclipse OMR is an open source collection of robust, reusable components for the construction of production-ready compilers. Great progress has been made on JITBuilder, OMR's simplified interface to the compiler technology for building JIT compilers, but work on the planned interface for AOT compilation has only just begun. In this survey, we identify desirable characteristics for the design of OMR's AOT component by examining how several prominent open source compilers implement AOT compilation. We conclude by discussing the advantages and disadvantages of the implementations seen, and how they might inform the final design of the OMR AOT component.

Adaptation as a Service

Hamzeh Khazaei , University of Alberta ; Alireza Ghanbari , Khatam-ol-Anbia (PBU) University ; Marin Litoiu , York University ;

Current and emerging complex systems of many types, including but not limited to big data systems, web-based systems, data centers and cloud infrastructure, social networks and the Internet of Things (IoT), have increasingly distributed and dynamic architectures that provide unprecedented flexibility in creating and supporting applications. However, such highly distributed architectures also increase the complexity of end-to-end management of these systems. Due to the sheer complexity, uncertainty, and at the same time programmability of cloud environments, microservices and big data analytics, it is now both required and possible to enable autonomic management in distributed systems in a dependable manner. In this paper, we argue that building autonomic management systems is a challenging task that requires its own set of expertise and knowledge. Therefore, in light of current challenges, available enablers and recent success stories, we propose the idea of moving from self-adaptation to ADaptation-as-a-Service (ADaaS).

All Timescale Window Co-occurrence

Yumeng Liu , University of Rochester ; Daniel Busaba , University of Rochester ; Chen Ding , University of Rochester ; Daniel Gildea , University of Rochester ;

Trace analysis is a common problem in system optimization and data analytics. This paper presents new efficient algorithms for window co-occurrence analysis, which is to find how likely two events will occur together in time windows of different lengths. The new solution requires a linear time preprocessing step, after which, it only takes logarithmic space and constant time to compute co-occurrence of a data pair in windows of any given length. One potential use of the new analysis is to reduce the asymptotic cost in affinity-based memory layout.
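The paper's linear-time preprocessing is not described in the abstract; as a hypothetical illustration of the problem being solved (not the authors' algorithm), a naive baseline that counts pair co-occurrence in fixed-length sliding windows might look like this:

```python
from collections import Counter
from itertools import combinations

def window_cooccurrence(trace, window):
    """Count, for each unordered event pair, how many length-`window`
    sliding windows contain both events. This naive baseline costs
    O(n * window^2); the paper's contribution is answering such queries
    in constant time for any window length after linear preprocessing."""
    counts = Counter()
    for start in range(len(trace) - window + 1):
        seen = set(trace[start:start + window])
        for pair in combinations(sorted(seen), 2):
            counts[pair] += 1
    return counts

# In trace "abcab" with window 3, 'a' and 'b' co-occur in all three
# windows: abc, bca, cab.
print(window_cooccurrence("abcab", 3)[("a", "b")])  # → 3
```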

Challenges and solutions on architecting Blockchain Systems

Gregory Fournier , Polytechnique Montreal ; Fabio Petrillo , University of Quebec at Chicoutimi ;

Despite the fact that companies are gravitating more and more towards the use of blockchains in their systems, it is clear that blockchain is no silver bullet. Many challenges, such as scalability issues and frustrating trade-offs most notable in public decentralized blockchain systems, are currently holding back blockchain's huge potential. In this paper we conduct a Systematic Literature Review to explore the current challenges of blockchain and present possible solutions to each of these challenges. We conclude that current challenges can be summarized in three categories: scalability issues, security issues, and the choice of consensus protocol. We also briefly discuss the use of blockchain in current systems, concluding that while blockchain's current immaturity makes it hard to recommend for most projects, blockchains in their current state could be used in the Internet of Things.

Ischemic Brain Stroke Detection using EEG signals

Arooj Ahmed Qureshi , McMaster University ; Canxiu Zhang , McMaster University ; Rong Zheng , McMaster University ; Ahmed Elmeligi , HiNT ;

Stroke is the second leading cause of death in the United States of America. 87% of all strokes are ischemic strokes, which are mainly caused by the blockage of small blood vessels around the brain. Magnetic resonance imaging (MRI) provides the gold standard for accurate diagnosis of ischemic strokes, but it is both time-consuming and unsuitable for 24/7 monitoring. In this paper, we propose an ischemic stroke detection method based on multi-domain analysis of EEG brain signals from wearable EEG devices and machine learning. Using data from 40 healthy subjects and 40 patients, we find that Multi-Layered Perceptron (MLP) and Bootstrap models (Extra-Tree and Decision-Tree) can achieve a test accuracy of 95% with an area under the ROC curve of 0.85.

Ontology Driven Temporal Event Annotator mHealth Application Framework

Amente Bekele , Carleton University ; Joe Samuel , Carleton University ; Shermeen Nizami , Carleton University ; Amna Basharat , National University of Computer & Emerging Sciences ; Randy Giffen , IBM Canada Ltd. ; James Green , Carleton University ;

We present an application (app) framework to facilitate the collection of gold standard temporal event annotations. These data will enable training and evaluation of machine learning algorithms for predicting events of clinical significance. Recording of such data using pen and paper can prove to be tedious and error-prone due to the variation in the types of events and the frequency of occurrence. To address this problem, we developed an mHealth application framework that presents an intuitive and configurable user interface for annotating a timeline with events. The presented Temporal Event Annotator (TEA) app framework supports dynamically building a customized application inclusive of events, event categories, and study attributes based on the design input of a specific study. This is accomplished by presenting a terminology schema for the hierarchical definition of event types and an additional user interface (UI) schema to support UI-specific attributes. We describe the framework architecture independent of specific technology implementations. We also describe specific instantiations of the framework that we used to develop and evaluate apps for three different use cases: 1) patient monitoring in the Neonatal Intensive Care Unit (NICU), 2) estimating patient stress levels during immersive rehabilitation therapy, and 3) quantifying the patient experience during emergency neonatal transport. The TEA framework provides a reliable and intuitive solution for temporal event annotation that accounts for the unique experimental requirements of each study.
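The TEA schemas themselves are not shown in the abstract; as a loose sketch of the idea (category and event names below are invented, not from the paper), a terminology schema can be a hierarchy of event categories against which each annotation is validated:

```python
# Hypothetical terminology schema: event categories mapped to the
# leaf event types a study allows, in the spirit of TEA's schemas.
SCHEMA = {
    "vital_signs": ["apnea", "bradycardia"],
    "interventions": ["suction", "repositioning"],
}

def annotate(timeline, timestamp, category, event):
    """Append a timestamped event annotation after validating it
    against the study's terminology schema."""
    if event not in SCHEMA.get(category, []):
        raise ValueError(f"unknown event {event!r} in category {category!r}")
    timeline.append({"t": timestamp, "category": category, "event": event})
    return timeline

timeline = annotate([], 12.5, "vital_signs", "apnea")
print(timeline[0]["event"])  # → apnea
```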

Powering Software Sustainability with Blockchain

Omar Badreddin , University of Texas ;

Software sustainability is a systematic challenge that impacts broad segments of software systems. Software codebases must evolve over time to address changing contexts and adapt to the flux in middleware and platforms. In the process, they accumulate arbitrary complexities and their maintenance becomes progressively difficult. Current sustainability approaches focus on the symptoms, tend to be reactive in nature, and ignore the fundamental incentive structures that drive decision-making processes. Moreover, contemporary approaches are insensitive to the uniqueness of each software project's context and operate on the assumption that sustainability measurements are universally applicable to the majority of software systems. This paper introduces a fundamentally novel peer-driven approach to managing software sustainability. The methodology ensures that software teams can define their own sustainability measures that adequately address the unique context of their project and its priorities. These measures are dynamically defined by the project peers to ensure their applicability as the project context evolves. Finally, the paper introduces Susereum, a blockchain platform that materializes the methodology and establishes novel incentive structures to systematically promote software sustainability throughout the project lifecycle.

Predictive Analytics in Healthcare: Epileptic Seizure Recognition

Ashok Bhowmick , Ryerson University ; Tamer Abdou , Arish University ; Ayse Bener , Ryerson University ;

Introduction: Clinical applications of electroencephalography (EEG) span a very broad range of diagnostic conditions, among which epileptic seizure is the fourth most common neurological disorder. Related Work: There has been considerable progress in the clinical understanding of epilepsy; however, many aspects of seizure prevention are still a mystery. Predictive modeling of EEG can add significant value in substantiating the diagnosis of epilepsy. Methodology: Machine learning algorithms are applied to predict the probability of epileptic seizure using an open source multi-class dataset. Results and Discussion: Comparing the F-scores from different classifiers, we find that XGBoost gives the best performance in binary classification and Random Forest the best performance in multinomial classification. Conclusion: Our results show that it is possible to predict epileptic seizure with significant accuracy from non-epileptic parameters using a suitable machine learning algorithm. We also observe that binary classification methods have higher prediction accuracy.
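The abstract compares classifiers by F-score; as a reminder of the metric itself (not the authors' code), the binary F1 score is the harmonic mean of precision and recall computed from confusion-matrix counts:

```python
def f1_score(tp, fp, fn):
    """Binary F1: harmonic mean of precision and recall,
    computed from true positives, false positives and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# A classifier with 90 true positives, 10 false positives and
# 10 false negatives has precision = recall = 0.9, so F1 = 0.9.
print(round(f1_score(90, 10, 10), 3))  # → 0.9
```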

Uncertainty Quantification-as-a-Service

Malgorzata Zimon , IBM ; Vadim Elisseev , IBM ; Robert Sawko , IBM ; Samuel Antão , IBM ; Kirk Jordan , IBM ;

Uncertainty quantification (UQ), which enables non-destructive virtual testing, is a fast-growing area of modern computational science. UQ methods are computationally intensive and require the construction of complex work-flows that rely on a number of different software components, often coming from different projects. Therefore, there is a need to develop a portable and scalable UQ pipeline that enables efficient stochastic modelling. Our paper introduces a strategy for UQ as a Service using high performance computing and hybrid cloud infrastructures, and presents its application to a heat transfer study in nuclear reactor simulation and to the modelling of tsunami events.
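The authors' pipeline is not reproduced here; as a generic illustration of the kind of computation a UQ work-flow orchestrates, a minimal Monte Carlo propagation of input uncertainty through a model (the model and its parameters below are invented) might be:

```python
import random
import statistics

def propagate(model, sample_input, n=10_000, seed=42):
    """Monte Carlo uncertainty propagation: draw uncertain inputs,
    push each sample through the model, and summarise the output spread."""
    rng = random.Random(seed)
    outputs = [model(sample_input(rng)) for _ in range(n)]
    return statistics.mean(outputs), statistics.stdev(outputs)

# Hypothetical example: heat flux q = k * dT with uncertain
# conductivity k ~ Normal(0.5, 0.05) and fixed dT = 100.
mean, std = propagate(
    model=lambda k: k * 100.0,
    sample_input=lambda rng: rng.gauss(0.5, 0.05),
)
print(mean, std)  # mean close to 50, std close to 5
```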

(Semi)Automatic Construction of Access-Controlled Web Data Services

Kalvin Eng , University of Alberta ; Diego Serrano , University of Alberta ; Eleni Stroulia , University of Alberta ; Jacob Jaremko , University of Alberta ;

The widespread adoption of the Internet of Things (IoT) is producing an ever-increasing stream of data that can be mined by multiple stakeholders, in support of different objectives and tasks. In fact, we are witnessing the emergence of data marketplaces that aim to share this data and harness economic value out of these transactions. The advent of data-as-a-service (DaaS) represents a key integrator opportunity that allows for the management of data collections, while providing specific privacy policies to delegated agents. To support DaaS integrations, we develop a model-driven method for creating APIs to deliver DaaS. Our method enables data owners to: (1) automatically abstract the representation of relational database schemas into a visual model and map them to existing ontologies, (2) use the mappings to create different role-based access-control views of APIs, and (3) automatically generate API endpoints and their responses based on these mappings. We develop a 'plug-and-play' prototype system for SQL databases to demonstrate this methodology and apply it to a use case of controlling data from a fitness monitoring application. Our aim is to use semantics to enhance existing, often cumbersome API creation methodologies so that data can be easily shared and distributed.
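The authors' prototype and mappings are not shown; a toy sketch of the idea behind step (2) — filtering an API response according to a role-based access-control view — could look like this (the role and field names are invented for illustration):

```python
# Hypothetical role-based views: each role may read a subset of fields.
VIEWS = {
    "researcher": {"steps", "heart_rate"},
    "insurer": {"steps"},
}

def apply_view(record, role):
    """Return only the fields the given role is allowed to see,
    mimicking a role-specific DaaS API response."""
    allowed = VIEWS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}

record = {"user_id": 7, "steps": 9001, "heart_rate": 72}
print(apply_view(record, "insurer"))  # → {'steps': 9001}
```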

A Case Study of Spark Resource Configuration and Management for Image Processing Applications

Dwight Makaroff , University of Saskatchewan ; Derek Eager , University of Saskatchewan ; Winfried Grassmann , University of Saskatchewan ; Habib Sabiu , University of Saskatchewan ; Owolabi Adekoya , University of Saskatchewan ;

The world population is expected to reach an estimated 9.8 billion by 2050, necessitating substantial increases in food production. Achieving such increases will require large-scale application of computer informatics within the agricultural sector. In particular, application of informatics to crop breeding has the potential to greatly enhance our ability to develop new varieties quickly and economically. Achieving this potential, however, will require capabilities for analyzing huge volumes of data acquired from various field-deployed image acquisition technologies. Although numerous frameworks for big data processing have been developed, there are relatively few published case studies that describe user experiences with these frameworks in particular application science domains. In this paper, we describe our efforts to apply Apache Spark to three applications of initial interest within the Plant Phenotyping and Imaging Research Centre (P2IRC) at the University of Saskatchewan. We find that default Spark parameter settings do not work well for these applications. We carry out extensive performance experiments to investigate the impact of alternative Spark parameter settings, both for applications run individually and in scenarios with multiple concurrently executing applications. We find that optimizing Spark parameter settings is challenging, but can yield substantial performance improvements, particularly with concurrent applications, provided that the dataset characteristics are considered. This is a first step towards insights regarding Spark parameter tuning on these classes of applications that may be more generally applicable to broader ranges of applications.

A Competitive Platform for Continuous Programming Skill Enhancement

Jen-Hao Kuo , National Cheng Kung University ; Tsung-Han Wu , National Cheng Kung University ; Hong-Bao Ye , National Cheng Kung University ; Hewijin Christine Jiau , National Cheng Kung University ;

Enhancing programming skills is the key to keeping up with ever-changing technologies in the IT industry. Implementing strategies on a game-based platform is a common way for programmers to enhance their programming skills. However, the runtime simulation and game metrics provided by current game-based strategy platforms are ineffective at motivating continuous programming skill enhancement. We propose ELOP, a competitive game-based strategy platform to motivate programmers. ELOP automatically schedules competitions for programmers, keeps competition history, records changes in performance, and provides the personal information needed for further enhancement. To evaluate the effectiveness of ELOP, we conduct several studies. The results show that ELOP does motivate programmers to practice programming continuously and enhances their programming skills.

A Context-Aware Machine Learning-based Approach

Nathalia Nascimento , Pontificia Universidade Católica do Rio de Janeiro (PUC-Rio) ; Carlos Lucena , Pontificia Universidade Católica do Rio de Janeiro (PUC-Rio) ; Paulo Alencar , University of Waterloo ; Donald Cowan , University of Waterloo ;

It is known that training a general and versatile Machine Learning (ML)-based model is more cost-effective than training several specialized ML models for different operating contexts. However, as the volume of training information grows, so does the probability of producing biased results. Learning bias is a critical problem for many applications, such as those related to healthcare scenarios, environmental monitoring and air traffic control. In this paper, we compare the use of a general model trained using all contexts against a system composed of a set of specialized models, each trained for a particular operating context. For this purpose, we propose a local learning approach based on context-awareness, which involves: (i) anticipating, analyzing and representing context changes; (ii) training and finding machine learning models to maximize a given scoring function for each operating context; (iii) storing trained ML-based models and associating them with corresponding operating contexts; and (iv) deploying a system that is able to select the best-fit ML-based model at runtime based on the context. To illustrate our proposed approach, we reproduce two experiments: one that uses a neural network regression-based model to perform predictions and another that uses an evolutionary neural network-based approach to make decisions. For each application, we compare the results of the general model, trained on all contexts, against the results of our proposed approach. We show that our context-aware approach can improve results by alleviating bias across different ML tasks.
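The authors' trained models are not available here; the selection step (iv) can be sketched with a registry that associates operating contexts with models and falls back to a general model (the dummy "models" below are placeholders, not the paper's networks):

```python
class ContextAwareSelector:
    """Store one trained model per operating context and pick the
    best-fit model at runtime based on the observed context."""

    def __init__(self, default):
        self.default = default  # general model trained on all contexts
        self.models = {}        # context -> specialized model

    def register(self, context, model):
        self.models[context] = model

    def predict(self, context, x):
        model = self.models.get(context, self.default)
        return model(x)

# Dummy models: a general one and a specialist for a "rain" context.
selector = ContextAwareSelector(default=lambda x: x * 1.0)
selector.register("rain", lambda x: x * 0.5)

print(selector.predict("rain", 10))   # → 5.0  (specialist)
print(selector.predict("sunny", 10))  # → 10.0 (falls back to general)
```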

Assuring the runtime behavior of self-adaptive cyber-physical systems using feature modeling

Nayreet Islam , University of Ontario Institute of Technology ; Akramul Azim , University of Ontario Institute of Technology ;

A self-adaptive cyber-physical system (SACPS) can adjust its behavior and configurations at runtime in response to varying requirements obtained from the system and the environment. With the increasing use of the SACPS in different application domains, such variations are becoming more common. Users today expect the SACPS to guarantee its functional and timing behavior even in adverse environmental situations. However, uncertainties in the SACPS environment impose challenges on assuring the runtime behavior during system design. Software product line engineering (SPLE) is considered a useful technique for handling varying requirements. In this paper, we present an approach for assuring the runtime behavior of the SACPS by applying an SPLE technique, namely feature modeling. By representing the feature-based model at design time, we map the possible adaptation requirements to reusable configurations. The proposed approach aims to model two dynamic variability dimensions: 1) environment variability, which describes the conditions under which the SACPS must adapt, and 2) structural variability, which defines the resulting architectural configurations. To validate our approach, experimental analysis is performed using two case studies: 1) a traffic monitoring SACPS and 2) an automotive SACPS. We demonstrate that the proposed feature-based modeling approach can be used to achieve adaptivity, which allows the SACPS to assure functional (executing the correct set of adaptive tasks) and non-functional (executing in the expected mode) correctness at runtime. The experimental results show that the feature-based SACPS achieves significant improvement in self-configuration time, self-adaptation time and scalability, with a lower probability of failure in different environmental situations.
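The case-study models are not given in the abstract; the two variability dimensions can be caricatured as a lookup from environment conditions (environment variability) to architectural configurations (structural variability). All condition, mode and task names below are invented for illustration:

```python
# Hypothetical feature model: environment conditions mapped to the
# architectural configuration the SACPS should adopt.
FEATURE_CONFIGS = {
    ("rain", "high_traffic"): {"mode": "safe", "tasks": ["brake_assist", "sensor_fusion"]},
    ("clear", "high_traffic"): {"mode": "normal", "tasks": ["sensor_fusion"]},
}
DEFAULT = {"mode": "normal", "tasks": []}

def reconfigure(weather, traffic):
    """Select the architectural configuration matching the observed
    environment, falling back to a default configuration."""
    return FEATURE_CONFIGS.get((weather, traffic), DEFAULT)

print(reconfigure("rain", "high_traffic")["mode"])  # → safe
```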

Automating the Detection of Third-Party Java Library Migration At The Function Level

Hussein Alrubaye , Rochester Institute of Technology ; Mohamed Mkaouer , Rochester Institute of Technology ;

The process of migrating between different third-party libraries is very complex. Typically, developers need to find functions in the new library that are most adequate to replace the functions of the retired library. This process is subjective and time-consuming, as developers need to fully understand the documentation of both libraries' Application Programming Interfaces and find the right match between their functions, if it exists. In this context, several studies rely on mining existing library migrations to provide developers with by-example approaches for similar scenarios. In this paper, we propose a mining approach that extracts all the manually performed function replacements for a given library migration. Our approach combines the mined function-change patterns with function-related lexical similarity to accurately detect mappings between replacing/replaced functions. Using our enhanced mining process, we perform a comparative study of state-of-the-art approaches for detecting migration traces at the function level. Our findings show the approach's efficiency in accurately detecting migration fragments, and it enhances the accuracy of state-of-the-art approaches in finding correct function changes. We finally provide the community with a dataset of migrations between popular Java libraries and their corresponding code changes at the function level.
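As a hypothetical illustration of the lexical-similarity component (not the authors' implementation), function names can be split into tokens and compared with a standard-library string ratio; the method names below are illustrative only:

```python
import re
from difflib import SequenceMatcher

def tokenize(name):
    """Split a camelCase or snake_case function name into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return " ".join(re.split(r"[_\s]+", spaced)).lower()

def name_similarity(a, b):
    """Lexical similarity between two function names, in [0, 1]."""
    return SequenceMatcher(None, tokenize(a), tokenize(b)).ratio()

# A plausible migration mapping scores far higher than an unrelated pair.
print(name_similarity("getJSONObject", "get_json_object") >
      name_similarity("getJSONObject", "closeStream"))  # → True
```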

Design and Implementation of Loss Mitigation in Spot Instances

Tasnim Kabir , Bangladesh University of Engineering and Technology ; A. B. M. Alim Al Islam , Bangladesh University of Engineering and Technology ;

Spot instances (as provided in Amazon Elastic Compute Cloud, EC2) offer resources at a reduced cost; however, they often provide less reliability. This happens because the resources assigned to spot instances can be withdrawn abruptly due to real-time variations in demand and price. A specialized mechanism that preserves this reduced cost while maintaining high reliability is desirable, as users generally try to minimize their loss while keeping cost minimal. Such a mechanism has received little attention in the literature to date. Therefore, in this paper, we propose a mechanism to mitigate the loss of these spot instances on the road to providing high reliability without increasing the computational cost. To do so, we first apply checkpointing at different stages of the computation. Further, we propose several algorithms to place checkpoints at proper points during a computation. Our proposed algorithms can mitigate the loss by up to 99.9% on average. We confirm these findings by experimenting over real Amazon spot instance history.
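The authors' placement algorithms are not reproduced here; as a baseline illustration of why checkpointing bounds the loss — on revocation, only the work since the last checkpoint is lost — consider:

```python
def lost_work(revocation_time, checkpoints):
    """Work lost when a spot instance is revoked: everything computed
    since the most recent checkpoint taken before the revocation.
    Without any checkpoints, the entire computation so far is lost."""
    done = [t for t in checkpoints if t <= revocation_time]
    last = max(done) if done else 0
    return revocation_time - last

# With checkpoints every 10 time units and a revocation at t=47,
# only the 7 units since the checkpoint at t=40 are lost.
print(lost_work(47, checkpoints=range(0, 100, 10)))  # → 7
```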

Detecting Communities in Social Networks Using Concept Interestingness

Mohamed-Hamza Ibrahim , University of Quebec in Outaouais ; Rokia Missaoui , UQO - LARIM ; Abir Messaoudi , UQO - LARIM ;

One key challenge in Social Network Analysis is to design an efficient and accurate community detection procedure as a means to discover intrinsic structures and extract relevant information. In this paper, we introduce a novel strategy called COIN, which exploits COncept INterestingness measures to detect communities based on the concept lattice construction of the network. Thus, unlike off-the-shelf community detection algorithms, COIN leverages relevant conceptual characteristics inherited from Formal Concept Analysis to discover substantial local structures. In the first stage of COIN, we extract the formal concepts that capture all the cliques and bridges in the social network. In the second stage, we use the stability index to remove noisy bridges between communities and then percolate (merge) relevant adjacent cliques. Our experiments on several real-world social networks show that COIN can quickly detect communities more accurately than existing prominent algorithms such as Edge betweenness, Fast greedy modularity, and Infomap.

Does Learning Elm Graphics Programming Reinforce Math Knowledge?

John Zhang , McMaster University ; Anirudh Verma , McMaster University ; Chinmay Sheth , McMaster University ; Christopher Schankula , McMaster University ; Stephanie Koehl , McMaster University ; Andrew Kelly , Hamilton-Wentworth District School Board ; Yumna Irfan , McMaster University ; Christopher K. Anand , McMaster University ;

Many important workloads depend on the efficient computation of elementary functions like square root and logarithm. Accurate computation of these functions is time-consuming, and hard for compilers to schedule, because of conditional execution. These problems are exacerbated by SIMD computation, which does not mix well with conditional execution. Previously, we have outlined how performance can be improved by encapsulating the conditional execution in new instructions. In this paper, we refine this approach to take into account testability, the ability for code to be pipelined, and exploitation of processors with a gather-load instruction. In particular, we look at the decomposition of the previously described instruction pairs into three instructions. The instructions can incorporate table lookups, or complement existing load instructions. The variant which complements existing load instructions is expected to perform as well as the other variants, and is easier to test and to pipeline. This paper presents gate-level details for the instructions required to calculate various logarithm functions, including the circuit depth, count and approximate width. In addition, we highlight the relative complexity of verifying these instructions relative to other known instructions, and outline our strategy for light-weight verification. Finally, we show that this strategy would be expected to produce a doubling of performance on a wide class of processors, using an IBM POWER processor as an example.

Empirical Vulnerability Analysis of Automated Smart Contracts Security Testing on Blockchains

Reza Parizi , Kennesaw State University ; Ali Dehghantanha , University of Guelph ; Kim Kwang Raymond Choo , University of Texas at San Antonio ; Amritraj Singh , Kennesaw State University ;

The emerging blockchain technology supports a decentralized computing paradigm shift and is a rapidly approaching phenomenon. While blockchain is thought of primarily as the basis of Bitcoin, its application has grown far beyond cryptocurrencies due to the introduction of smart contracts. Smart contracts are self-enforcing pieces of software, which reside and run over a hosting blockchain. Using blockchain-based smart contracts for secure and transparent management to govern interactions (authentication, connection, and transaction) in Internet-enabled environments, mostly IoT, is a niche area of research and practice. However, writing trustworthy and safe smart contracts can be tremendously challenging because of the complicated semantics of the underlying domain-specific languages and their testability. There have been high-profile incidents indicating that blockchain smart contracts can contain various code-security vulnerabilities, causing financial harm. When it comes to the security of smart contracts, developers writing the contracts should be capable of testing their code to diagnose security vulnerabilities before deploying them to the immutable environments on blockchains. However, there are only a handful of security testing tools for smart contracts. This implies that the existing research on automatic smart contract security testing is not adequate and remains in a very early stage of infancy. To more readily realize the application of blockchain smart contracts in security and privacy, we should first understand their vulnerabilities before widespread implementation. Accordingly, the goal of this paper is to carry out a far-reaching experimental assessment of current static smart contract security testing tools for the most widely used blockchain, Ethereum, and its domain-specific programming language, Solidity, to provide the first body of knowledge for creating more secure blockchain-based software.

Evaluating Efficiency, Effectiveness and Satisfaction of AWS and Azure from the Perspective of Cloud Beginners

Gabriel Costa Silva , Universidade Tecnológica Federal do Paraná ; Reginaldo Ré , Universidade Tecnológica Federal do Paraná ; Marco Aurélio Graciotto Silva , Universidade Tecnológica Federal do Paraná ;

Quality has long been regarded as an important driver of cloud adoption. In particular, the quality in use (QiU) of cloud platforms may drive cloud beginners to the cloud platform that offers the best cloud experience. Cloud beginners are critical to the cloud market because they currently represent nearly a third of cloud users. We carried out three experiments to measure the QiU (dependent variable) of public cloud platforms (independent variable) regarding efficiency, effectiveness and satisfaction. AWS EC2 and Azure Virtual Machines are the two cloud services used as representative proxies to evaluate cloud platforms (treatments). Eleven undergraduate students with limited cloud knowledge (participants) manually created 152 VMs (task) using the web interface of the cloud platforms (instrument), following seven different configurations (trials) for each cloud platform. Whereas AWS performed significantly better than Azure for efficiency (p-value not exceeding 0.001, A-statistic = 0.68), we could not find a significant difference between platforms for effectiveness (p-value exceeding 0.05), although the effect size was found relevant (odds ratio = 0.41). Regarding satisfaction, most of our participants perceived AWS as (i) having the best GUI for user interaction, (ii) the easiest platform to use, and (iii) the preferred cloud platform for creating VMs. Once confirmed by independent replications, our results suggest that AWS outperforms Azure regarding QiU. Therefore, cloud beginners might have a better cloud experience starting off their cloud projects by using AWS rather than Azure. In addition, our results may help to explain AWS's cloud leadership.

Evaluating Music Mastering Quality Using Machine Learning

Mark Shtern , York University ; Pedro Casas , York University ; Vassilios Tzerpos , York University ;

Machine learning has been applied in a vast array of applications in recent years, including several qualitative problems in the arts. However, in the world of music production, including mixing and mastering, most tasks are still performed by music professionals with decades of experience. Aspiring mastering engineers typically have to apprentice with professionals to learn their craft. Access to professionals is a scarce resource, though, as they are typically very busy. In this paper, we present a method to evaluate the mastering quality of a piece of music automatically. We delegate the task of determining what we deem to be a subjectively well-mastered song to professional mastering engineers. Using professionally mastered music, we derive datasets with varying degrees of deviation from the original music and train models to recognize the changes that have been made. This allows us to provide novice mastering engineers with an automatic rating of their work based on the magnitude of the deviation from the gold standard. We present experiments that demonstrate the accuracy of our approach, as well as a user study that shows how the results of our approach correlate with assessments made by human evaluators.

Feasibility of Internal Object Pools for Reduced Memory Management

Konstantin Nasartschuk , University of New Brunswick ; Kenneth Kent , University of New Brunswick ; Stephen MacKay , University of New Brunswick ; Aleksander Micic , IBM Canada Ltd. ;

The object pool is a widely used software engineering pattern that reuses object instances without the need for repeated allocation and instantiation. While the benefits of object pool structures persist in a garbage-collected environment, the pattern adds a memory management component to the development process. This paper investigates the feasibility of introducing automatically created and maintained object pools for predefined classes. Automatic object pools are implemented and discussed using the GenCon GC and Balanced GC policies.
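The pattern under study can be illustrated with a minimal sketch (in Python here; the pools the paper investigates are created and maintained automatically inside the JVM, whereas this one is managed explicitly by the application):

```python
class ObjectPool:
    """Minimal object pool: hands out released instances instead of
    allocating a new object on every request."""

    def __init__(self, factory):
        self._factory = factory  # callable that allocates a fresh instance
        self._free = []          # released instances awaiting reuse

    def acquire(self):
        # Reuse a pooled instance if one is available, else allocate.
        return self._free.pop() if self._free else self._factory()

    def release(self, obj):
        # Return the instance to the pool for later reuse.
        self._free.append(obj)

pool = ObjectPool(list)
a = pool.acquire()   # freshly allocated
pool.release(a)
b = pool.acquire()   # the same instance, reused
assert a is b
```

Note the memory management burden the abstract mentions: the caller must remember to `release`, and a pooled object may carry stale state, which is exactly what automatic, runtime-maintained pools aim to remove.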

Feature Engineering in Big Data for Detection of Information Systems Misuse

Eduardo Lopez , McMaster University ; Kamran Sartipi , East Carolina University ;

The increasing availability of very large volumes of digital data (i.e., Big Data) enables many interesting research streams on a wide variety of phenomena. However, there has been a paucity of Big Data sets in the area of cybersecurity in information systems, as organizations are reluctant to share data that may provide too much unrestricted visibility into their operations. In this study, we explore the use of a real-life, anonymized, very large dataset containing user behavior as captured in log files, including both regular usage and misuse, typifying the dynamics found in a situation with compromised user credentials. Through the experiment, we validate that the existence of a large user behavior dataset does not in itself guarantee that abnormal behaviors can be found. It is essential that researchers apply deep domain knowledge, critical thinking and practical focus to ensure the data can produce the knowledge required for the ultimate objective of detecting insider threats. In this paper, we develop, formulate and calculate the features that best represent user behavior in the underlying information systems, maintaining a parsimonious balance between complexity, resource demands and detection effectiveness. We test a classification model that demonstrates the usefulness and applicability of the extracted features.
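The kind of feature engineering described, aggregating raw log events into per-user behavioral features, can be sketched as follows (the record format and feature names here are hypothetical illustrations, not the paper's):

```python
from collections import defaultdict

def user_features(log_records):
    """Aggregate per-user features from raw log events.
    Hypothetical record format: (user, action, timestamp)."""
    raw = defaultdict(lambda: {"events": 0, "actions": set(), "times": []})
    for user, action, ts in log_records:
        f = raw[user]
        f["events"] += 1          # total activity volume
        f["actions"].add(action)  # behavioral diversity
        f["times"].append(ts)     # for temporal features
    # Reduce to a parsimonious numeric feature vector per user.
    return {
        user: {
            "events": f["events"],
            "distinct_actions": len(f["actions"]),
            "active_span": max(f["times"]) - min(f["times"]),
        }
        for user, f in raw.items()
    }
```

A classifier would then be trained on these vectors rather than on the raw log lines, which is the "parsimonious balance" the abstract refers to.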

Hardware/Software CoDesign for Mathematical Function Acceleration

Christopher Anand , McMaster University ; Lucas Dutton , McMaster University ; Adele Olejarz , McMaster University ; Robert Enenkel , IBM Canada Ltd. ; Wolfram Kahl , McMaster University ;

Many important workloads depend on the efficient computation of elementary functions like square root and logarithm. Accurate computation of these functions is time-consuming, and hard for compilers to schedule, because of conditional execution. These problems are exacerbated by SIMD computation, which does not mix well with conditional execution. Previously, we outlined how performance can be improved by encapsulating the conditional execution in new instructions. In this paper, we refine this approach to take into account testability, the ability for code to be pipelined, and the exploitation of processors with a gather-load instruction. In particular, we look at the decomposition of the previously described instruction pairs into three instructions. The instructions can incorporate table lookups, or complement existing load instructions. The variant which complements existing load instructions is expected to perform as well as the other variants, and is easier to test and to pipeline. This paper presents gate-level details for the instructions required to calculate various logarithm functions, including the circuit depth, gate count and approximate width. In addition, we highlight the complexity of verifying these instructions compared to other known instructions, and outline our strategy for light-weight verification. Finally, we show that this strategy would be expected to double performance on a wide class of processors, using an IBM POWER processor as an example.
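The table-lookup approach that such instructions encapsulate can be sketched in software (a simplified illustration of the general technique, not the paper's gate-level design): split the argument into mantissa and exponent, look up log2 of the leading mantissa bits in a precomputed table, and interpolate between neighboring entries.

```python
import math

# Precompute log2 over the mantissa range [1, 2) at 2**TABLE_BITS points.
TABLE_BITS = 8
TABLE = [math.log2(1.0 + i / 2**TABLE_BITS) for i in range(2**TABLE_BITS + 1)]

def log2_approx(x):
    """Table-lookup log2 for x > 0: exponent extraction plus a
    linearly interpolated mantissa table."""
    m, e = math.frexp(x)      # x = m * 2**e with m in [0.5, 1)
    m, e = 2.0 * m, e - 1     # renormalize so m is in [1.0, 2.0)
    t = (m - 1.0) * 2**TABLE_BITS
    i = int(t)                # index of the leading mantissa bits
    frac = t - i              # remaining bits drive the interpolation
    return e + TABLE[i] * (1.0 - frac) + TABLE[i + 1] * frac
```

The branch-free body is what makes this style of computation attractive for SIMD and for encapsulation in dedicated instructions; in hardware, the table becomes a small ROM indexed by the leading mantissa bits.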

Just-in-time Detection of Protection-Impacting Changes on WordPress and MediaWiki

Amine Barrak , Polytechnique Montréal ; Marc-André Laverdière , Polytechnique Montréal ; Foutse Khomh , Polytechnique Montréal ; Le An , Polytechnique Montréal ; Ettore Merlo , Polytechnique Montréal ;

Access control mechanisms based on roles and privileges restrict the access of users to security-sensitive resources in a multi-user software system. Unintentional privilege protection changes may occur during the evolution of a system and may introduce security vulnerabilities, threatening users' confidential data and causing other severe problems. In this paper, we use the Pattern Traversal Flow Analysis technique to identify definite protection differences in the WordPress and MediaWiki systems. We analyse the evolution of privilege protections across 211 and 193 releases of WordPress and MediaWiki, respectively, and observe that around 60% of commits affect privilege protections in both projects. We refer to these commits as protection-impacting change (PIC) commits. To help developers identify PIC commits just-in-time, we extract a series of metrics from commit logs and source code, and build statistical models. The evaluation of these models revealed that they can achieve a precision of up to 73.8% and a recall of up to 98.8% in WordPress, and a precision of up to 77.2% and a recall of up to 97.8% in MediaWiki. Among the metrics examined, commit churn, bug fixing, author experience and code complexity between two releases are the most important predictors in the models. We performed a qualitative analysis of false positives and false negatives and observe that PIC commit detectors should ignore documentation-only commits and process code changes without their comments.
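For reference, the precision and recall figures above follow the standard definitions over a binary classifier's confusion counts; a minimal sketch with hypothetical commit labels:

```python
def precision_recall(predicted, actual):
    """Precision and recall for a binary classifier (e.g., 'is this a
    PIC commit?'). Assumes at least one predicted and one actual
    positive, so the denominators are non-zero."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)       # correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)   # flagged but harmless
    fn = sum(1 for p, a in pairs if not p and a)   # missed PIC commits
    return tp / (tp + fp), tp / (tp + fn)
```

The abstract's observation about documentation-only commits maps onto the false-positive term: filtering them out before classification shrinks `fp` and thus raises precision.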

Natural Language Question Answering in the Financial Domain

John Boyer , IBM Canada Ltd. ;

This paper describes a natural language question answering system focused on answering financial domain questions using a daily updated corpus of financial reports. Financial entity types of interest included company stocks, country bonds, currencies, industries, commodities, and diversified assets. Financial questions of interest included explanatory and factual questions about entities as well as financial outlook for entities. An important architectural divergence emerged between the approach required for answering financial outlook questions versus the approach for answering other financial information questions. The financial domain focus also introduced additional challenges to open domain natural language processing that were addressed in the areas of document ingestion, question classification accuracy, question analysis techniques, speed of machine learning, answer ranking by linguistic confidence versus temporality, and system accuracy assessment.

Node.js Scalability Investigation in the Cloud

Jiapeng Zhu , University of New Brunswick ; Panagiotis Patros , University of Waikato, University of New Brunswick ; Kenneth Kent , University of New Brunswick ; Michael Dawson , IBM Canada Ltd. ;

Node.js has gained popularity in cloud development due to its asynchronous, non-blocking and event-driven nature. However, scalability issues can limit the number of concurrent requests while achieving an acceptable level of performance. To the best of our knowledge, no cloud-based benchmarks or metrics focusing on Node.js scalability exist. This paper presents the design and implementation of Ibenchjs, a scalability-oriented benchmarking framework, and a set of sample test applications. We deploy Ibenchjs in a local and isolated cloud to collect and report scalability-related measurements and issues of Node.js, as well as performance bottlenecks. Our findings include: 1) the scaling performance of the Node.js test applications was sub-linear; 2) no improvements were measured when more CPUs were added without modifying the number of Node.js instances; and 3) leveraging cloud scaling solutions significantly outperformed Node.js-module-based scaling.
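Sub-linear scaling of the kind reported is commonly quantified as parallel efficiency, throughput(n) / (n × throughput(1)). A minimal sketch with hypothetical measurements (not Ibenchjs output):

```python
def scaling_report(throughput):
    """Speedup and parallel efficiency from throughput measured at
    varying instance counts; requires a 1-instance baseline.
    Sub-linear scaling shows up as efficiency below 1.0."""
    base = throughput[1]
    return {
        n: {"speedup": t / base, "efficiency": t / (base * n)}
        for n, t in throughput.items()
    }

# Hypothetical requests/s resembling sub-linear scaling: efficiency
# falls from 1.0 to 0.9 to 0.775 as instances are added.
measured = {1: 1000, 2: 1800, 4: 3100}
report = scaling_report(measured)
```

Finding 2) above corresponds to the degenerate case where `n` (CPUs) grows while the measured throughput stays flat, driving efficiency down with no speedup at all.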

Persistent Memory Storage of Cold Regions in the OpenJ9 Java Virtual Machine

Scott Young , University of New Brunswick ; Michael Flawn , University of New Brunswick ; Kenneth Kent , University of New Brunswick ; Gerhard Dueck , University of New Brunswick ; Charlie Gracie , IBM Canada Ltd. ;

In this paper, an optimization technique for object-oriented language runtimes with automatic memory management is investigated. The technique segregates objects into different memory areas, backed by different memory devices, on a per-object basis. It is compared to operating system paging mechanisms with swap partitions. Two different schemes for determining which objects should be segregated into slower memory are tested and their results discussed. We observe that each technique can be the best or the worst choice depending on the application.

Reducing Variability of Technically Related Software Systems in Large-Scale IT Landscapes

Kenny Wehling , Volkswagen AG ; David Wille , Technische Universität Braunschweig ; Christoph Seidl , Technische Universität Braunschweig ; Ina Schaefer , Technische Universität Braunschweig ;

The number of software systems in a company typically grows with the business requirements. Therefore, IT landscapes in large companies can consist of hundreds or thousands of different software systems. As the evolution of such large-scale landscapes is often uncoordinated, they commonly comprise different groups of related software systems using a common core technology (e.g., Java web applications) implemented by a variety of architectural components (e.g., different application servers or databases). This leads to increased costs and higher effort for maintaining and evolving these software systems and the entire IT landscape. To alleviate these problems, the variability of such technically related software systems has to be reduced. For this purpose, experts have to assess and evaluate restructuring potentials in order to take appropriate restructuring decisions. As a manual analysis requires high effort and is not feasible for large-scale IT landscapes, experts face a major challenge. To overcome this challenge, we introduce a novel approach to automatically support experts in taking reasonable restructuring decisions. By providing automated methods for assessing, evaluating and simulating restructuring potentials, experts are capable of reducing the variability of related software systems in large-scale IT landscapes. We show the suitability of our approach through expert interviews and an industrial case study with architectures of real-world software systems.

Scalable Practical Byzantine Fault Tolerance with Short-Lived Signature Schemes

Xinxin Fan , IoTeX ;

The Practical Byzantine Fault Tolerance (PBFT) algorithm is a popular solution for establishing consensus in blockchain systems. The execution time of the PBFT consensus algorithm has an important effect on blockchain throughput. Digital signatures are extensively used in PBFT to ensure the authenticity of messages during the different phases. Due to the round-based, broadcast nature of PBFT, nodes need to verify multiple signatures received from their peers, which incurs significant computational overhead and slows down the consensus process. To address this issue, we propose an efficient short-lived-signature-based PBFT variant, which utilizes short-length cryptographic keys to sign and verify messages in PBFT for a short period of time, and blockchain-aided key distribution mechanisms to update those keys periodically. We also present efficient algorithms for accelerating the software implementation of the BLS threshold signature scheme. Our extensive experiments with three elliptic curves and two signature schemes demonstrate the efficacy of using short-lived signature schemes for improving the scalability of PBFT significantly.
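The verification overhead motivating this work follows from PBFT's broadcast phases: with n = 3f + 1 replicas, each replica verifies a signature on the pre-prepare plus on every prepare and commit message it receives from its peers, so system-wide work grows quadratically in n. A simplified count (ignoring view changes, checkpoints and replies; exact figures vary by formulation):

```python
def pbft_verifications(f):
    """Rough signature-verification count for one PBFT round with
    n = 3f + 1 replicas: each non-primary step verifies 1 pre-prepare,
    and every replica verifies a prepare and a commit from each of its
    n - 1 peers. Simplified illustration, not an exact protocol count."""
    n = 3 * f + 1
    per_replica = 1 + 2 * (n - 1)   # pre-prepare + prepares + commits
    return n, per_replica, n * per_replica

# Tolerating f = 1 fault already means 4 replicas and dozens of
# verifications per round; the total grows quadratically with n,
# which is the cost that cheaper short-lived signatures attack.
```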

The Impact of Design and UML Modeling on Codebase Quality and Sustainability

Omar Badreddin , University of Texas ; Khandoker Rahad , University of Texas ;

The general consensus of researchers and practitioners is that upfront and continuous software design using modeling languages such as UML improves code quality and reliability, particularly as the software evolves over time. Software designs and models help in managing the underlying code complexities, which is crucial for sustainability. Recently, there has been increasing evidence suggesting broader adoption of modeling languages such as UML. However, our understanding of the impact of using such modeling and design languages remains limited. This paper reports on a study that aims to characterize this impact on code quality and sustainability. We identify a sample of open source software repositories with extensive use of designs and modeling and compare their code quality with that of similar code-centric repositories. Our evaluation focuses on various code quality attributes such as code smells and technical debt. We also conduct code evolution analysis over a five-year period and collect additional data from questionnaires and interviews with active repository contributors. This study finds that repositories with significant use of models and design activities are associated with reduced critical code smells but also with an increase in non-critical code smells. The study also finds that modeling and design activities are associated with a significant reduction in measures of technical debt. Analyzing code evolution over a five-year period reveals that UML repositories start with significantly lower technical debt density measures but tend to decline over time.

UML-Driven Automated Software Deployment

Luis F. Rivera , Universidad Icesi, University of Victoria ; Norha M. Villegas , Universidad Icesi ; Gabriel Tamura , Universidad Icesi ; Miguel Jiménez , University of Victoria ; Hausi A. Müller , University of Victoria ;

Software companies face the challenge of ensuring customer satisfaction through the continuous delivery of functionalities and rapid response to quality issues. However, achieving frequent software delivery is not a trivial task. It requires agile and continuous design, development and deployment of existing and new software features. Over time, managing these systems becomes increasingly complex. This complexity stems, in part, from the deployment pipelines and the myriad possible configurations of the software components. Furthermore, software deployment is a time-consuming and error-prone process which, even when automated, can lead to configuration errors and cost overruns. In this paper, we address deployment challenges that developers face during continuous delivery and DevOps. Our proposal consists of Urano, a mechanism for automating the deployment process, which uses UML, an interoperable and de facto modeling standard, as a means of specifying a software architecture and its associated deployment. Our approach is based on model-driven architecture principles to generate executable deployment specifications from user-defined UML deployment diagrams. We extend this kind of diagram by defining and applying a UML profile that captures the semantics and requirements of the installation, configuration and update of software components, thus enabling more expressive deployment specifications and their automatic realization. To evaluate Urano, we conducted three case studies that demonstrate its potential to effectively automate software deployment processes in industry.