arXiv Paper Daily: Thu, 23 Jan 2020

Neural and Evolutionary Computing

Learning Directed Locomotion in Modular Robots with Evolvable Morphologies

Gongjin Lan, Matteo De Carlo, Fuda van Diggelen, Jakub M. Tomczak, Diederik M. Roijers, A.E. Eiben

Comments: 30 pages, 14 figures

Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)

We generalize the well-studied problem of gait learning in modular robots in

two dimensions. Firstly, we address locomotion in a given target direction that

goes beyond learning a typical undirected gait. Secondly, rather than studying

one fixed robot morphology we consider a test suite of different modular

robots. This study is based on our interest in evolutionary robot systems where

both morphologies and controllers evolve. In such a system, newborn robots have

to learn to control their own body that is a random combination of the bodies

of the parents. We apply and compare two learning algorithms, Bayesian

optimization and HyperNEAT. The results of the experiments in simulation show

that both methods successfully learn good controllers, but Bayesian

optimization is more effective and efficient. We validate the best learned

controllers by constructing three robots from the test suite in the real world

and observe their fitness and actual trajectories. The obtained results

indicate a reality gap that depends on the controllers and the shape of the

robots, but overall the trajectories are adequate and follow the target

directions successfully.

Automatic phantom test pattern classification through transfer learning with deep neural networks

Rafael B. Fricks, Justin Solomon, Ehsan Samei

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Medical Physics (physics.med-ph)

Imaging phantoms are test patterns used to measure image quality in computed

tomography (CT) systems. A new phantom platform (Mercury Phantom, Gammex)

provides test patterns for estimating the task transfer function (TTF) or noise

power spectrum (NPS) and simulates different patient sizes. Determining which

image slices are suitable for analysis currently requires manual annotation of

these patterns by an expert, as subtle defects may make an image unsuitable for

measurement. We propose a method of automatically classifying these test

patterns in a series of phantom images using deep learning techniques. By

adapting a convolutional neural network based on the VGG19 architecture with

weights trained on ImageNet, we use transfer learning to produce a classifier

for this domain. The classifier is trained and evaluated with over 3,500

phantom images acquired at a university medical center. Input channels for

color images are successfully adapted to convey contextual information for

phantom images. A series of ablation studies are employed to verify design

aspects of the classifier and evaluate its performance under varying training

conditions. Our solution makes extensive use of image augmentation to produce a

classifier that accurately classifies typical phantom images with 98% accuracy,

while maintaining as much as 86% accuracy when the phantom is improperly

imaged.

Accelerating supply chains with Ant Colony Optimization across range of hardware solutions

Ivars Dzalbs, Tatiana Kalganova

Subjects: Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Neural and Evolutionary Computing (cs.NE)

The Ant Colony algorithm has been applied to various optimization problems;

however, most of the previous work on scaling and parallelism focuses on

Travelling Salesman Problems (TSPs). Although useful for benchmarks and new

idea comparison, the algorithmic dynamics do not always transfer to complex

real-life problems, where additional meta-data is required during solution

construction. This paper looks at a real-life outbound supply chain problem using

Ant Colony Optimization (ACO) and its scaling dynamics with two parallel ACO

architectures – Independent Ant Colonies (IAC) and Parallel Ants (PA). Results

showed that PA was able to reach a higher solution quality in fewer iterations

as the number of parallel instances increased. Furthermore, speed performance

was measured across three different hardware solutions – 16 core CPU, 68 core

Xeon Phi and up to 4 GeForce GPUs. State-of-the-art ACO vectorization

techniques such as SS-Roulette were implemented using C++ and CUDA. Although

excellent for TSP, it was concluded that for the given supply chain problem

GPUs are not suitable due to meta-data access footprint required. Furthermore,

compared to its sequential counterpart, the vectorized CPU AVX2 implementation

achieved 25.4x speedup on CPU while Xeon Phi with its AVX512 instruction set

reached 148x on PA with Vectorized (PAwV). PAwV is therefore able to scale at

least up to 1024 parallel instances on the supply chain network problem solved.
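
As a rough illustration of the solution-construction step that the abstract describes, the sketch below shows plain roulette-wheel selection as used in textbook ACO: one ant picks its next component with probability proportional to pheromone and heuristic desirability. This is not the vectorized SS-Roulette variant mentioned above, and the function name, the `alpha`/`beta` exponents, and the dictionary-based inputs are assumptions made for illustration only.

```python
import random


def roulette_select(candidates, pheromone, heuristic, alpha=1.0, beta=2.0, rng=random):
    """Pick one candidate with probability proportional to
    pheromone**alpha * heuristic**beta (textbook ACO transition rule)."""
    weights = [(pheromone[c] ** alpha) * (heuristic[c] ** beta) for c in candidates]
    total = sum(weights)
    if total <= 0:
        # No usable information: fall back to a uniform choice.
        return rng.choice(candidates)
    threshold = rng.random() * total
    cumulative = 0.0
    for candidate, weight in zip(candidates, weights):
        cumulative += weight
        if threshold < cumulative:
            return candidate
    return candidates[-1]  # guard against floating-point round-off
```

In a Parallel Ants setting, many ants would run this selection concurrently against shared pheromone values, whereas Independent Ant Colonies would each keep their own pheromone table; the sketch covers only the single-ant step.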

Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification

Jiawei Zhang

Comments: 7 pages, 6 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Existing graph neural networks may suffer from the “suspended animation

problem” when the model architecture goes deep. Meanwhile, for some graph

learning scenarios, e.g., nodes with text/image attributes or graphs with

long-distance node correlations, deep graph neural networks will be necessary

for effective graph representation learning. In this paper, we propose a new

graph neural network, namely DIFNET (Graph Diffusive Neural Network), for graph

representation learning and node classification. DIFNET utilizes both neural

gates and graph residual learning for node hidden state modeling, and includes

an attention mechanism for node neighborhood information diffusion. Extensive

experiments will be done in this paper to compare DIFNET against several

state-of-the-art graph neural network models. The experimental results can

illustrate both the learning performance advantages and effectiveness of

DIFNET, especially in addressing the “suspended animation problem”.

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Fu-Ming Guo, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Comments: arXiv admin note: text overlap with arXiv:1909.05073

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Weight pruning has been widely acknowledged as a straightforward and

effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby

achieving acceleration on various platforms. However, most of the pruning

techniques are essentially trade-offs between model accuracy and regularity

which lead to impaired inference accuracy and limited on-device acceleration

performance. To solve the problem, we introduce a new sparsity dimension,

namely pattern-based sparsity that comprises pattern and connectivity sparsity,

and is both highly accurate and hardware friendly. With carefully

designed patterns, the proposed pruning unprecedentedly and consistently

achieves accuracy enhancement and better feature extraction ability on

different DNN structures and datasets, and our pattern-aware pruning framework

also achieves pattern library extraction, pattern selection, pattern and

connectivity pruning and weight training simultaneously. Our approach on the

new pattern-based sparsity naturally fits into compiler optimization for highly

efficient DNN execution on mobile platforms. To the best of our knowledge, it

is the first time that mobile devices achieve real-time inference for the

large-scale DNN models thanks to the unique spatial property of pattern-based

sparsity and the help of the code generation capability of compilers.

Computer Vision and Pattern Recognition

RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation

Andrew Rittenbach (1), John Paul Walters (1) ((1) University of Southern California Information Sciences Institute, Arlington VA)

Comments: 8 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Synthetic Aperture Radar (SAR) imaging systems operate by emitting radar

signals from a moving object, such as a satellite, towards the target of

interest. Reflected radar echoes are received and later used by image formation

algorithms to form a SAR image. There is great interest in using SAR images in

computer vision tasks such as automatic target recognition. Today, however, SAR

applications consist of multiple operations: image formation followed by image

processing. In this work, we show that deep learning can be used to train a

neural network able to form SAR images from echo data. Results show that our

neural network, RDAnet, can form SAR images comparable to images formed using a

traditional algorithm. This approach opens the possibility to end-to-end SAR

applications where image formation and image processing are integrated into a

single task. We believe that this work is the first demonstration of deep

learning based SAR image formation using real data.

Discovering Salient Anatomical Landmarks by Predicting Human Gaze

Richard Droste, Pierre Chatelain, Lior Drukker, Harshita Sharma, Aris T. Papageorghiou, J. Alison Noble

Comments: Accepted at IEEE International Symposium on Biomedical Imaging 2020 (ISBI 2020)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Anatomical landmarks are a crucial prerequisite for many medical imaging

tasks. Usually, the set of landmarks for a given task is predefined by experts.

The landmark locations for a given image are then annotated manually or via

machine learning methods trained on manual annotations. In this paper, in

contrast, we present a method to automatically discover and localize anatomical

landmarks in medical images. Specifically, we consider landmarks that attract

the visual attention of humans, which we term visually salient landmarks. We

illustrate the method for fetal neurosonographic images. First, full-length

clinical fetal ultrasound scans are recorded with live sonographer

gaze-tracking. Next, a convolutional neural network (CNN) is trained to predict

the gaze point distribution (saliency map) of the sonographers on scan video

frames. The CNN is then used to predict saliency maps of unseen fetal

neurosonographic images, and the landmarks are extracted as the local maxima of

these saliency maps. Finally, the landmarks are matched across images by

clustering the landmark CNN features. We show that the discovered landmarks can

be used within affine image registration, with average landmark alignment

errors between 4.1% and 10.9% of the fetal head long axis length.
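
The landmark-extraction step described above (taking local maxima of a predicted saliency map) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the 8-connected neighbourhood, the `threshold` parameter, and the function name are assumptions, and the saliency map is represented as a plain list of lists.

```python
def local_maxima(saliency, threshold=0.5):
    """Return (row, col) positions whose value is at least `threshold` and
    strictly greater than every 8-connected neighbour."""
    rows, cols = len(saliency), len(saliency[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = saliency[r][c]
            if value < threshold:
                continue
            # Collect the neighbours that exist inside the map borders.
            neighbours = [
                saliency[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks
```

Each returned position would then serve as a candidate landmark; matching candidates across images by clustering CNN features, as the abstract describes, is a separate step not shown here.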

Causality based Feature Fusion for Brain NeuroDevelopmental Analysis

Peyman Hosseinzadeh Kassani, Li Xiao, Gemeng Zhang, Julia M. Stephen, Tony W. Wilson, Vince D. Calhoun, Yu Ping Wang

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Human brain development is a complex and dynamic process that is affected by

several factors such as genetics, sex hormones, and environmental changes. A

number of recent studies on brain development have examined functional

connectivity (FC) defined by the temporal correlation between time series of

different brain regions. We propose to add the directional flow of information

during brain maturation. To do so, we extract effective connectivity (EC)

through Granger causality (GC) for two different groups of subjects, i.e.,

children and young adults. The motivation is that the inclusion of causal

interaction may further discriminate brain connections between two age groups

and help to discover new connections between brain regions. The contributions

of this study are threefold. First, there has been a lack of attention to

EC-based feature extraction in the context of brain development. To this end,

we propose a new kernel-based GC (KGC) method to learn nonlinearity of complex

brain network, where a reduced Sine hyperbolic polynomial (RSP) neural network

was used as our proposed learner. Second, we used causality values as the

weight for the directional connectivity between brain regions. Our findings

indicated that the strength of connections was significantly higher in young

adults relative to children. In addition, our new EC-based feature outperformed

FC-based analysis from Philadelphia neurocohort (PNC) study with better

discrimination of the different age groups. Moreover, the fusion of these two

sets of features (FC + EC) improved brain age prediction accuracy by more than

4%, indicating that they should be used together for brain development studies.
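
To make the notion of functional connectivity used above concrete, the sketch below computes an FC matrix as the pairwise temporal correlation between regional time series. Using the Pearson coefficient is an assumption here (the abstract says only "temporal correlation"), and the function names and toy inputs are illustrative. Unlike the Granger-causality-based effective connectivity the paper proposes, this matrix is symmetric and carries no directional information.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def functional_connectivity(series):
    """Return the symmetric region-by-region correlation matrix.

    `series` is a list with one time series (list of floats) per region.
    """
    k = len(series)
    return [[pearson(series[i], series[j]) for j in range(k)] for i in range(k)]
```

Because the matrix is symmetric, entry (i, j) cannot say whether region i drives region j or the reverse, which is the gap the directional EC features are meant to fill.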

Are Accelerometers for Activity Recognition a Dead-end?

Catherine Tong, Shyam A. Tailor, Nicholas D. Lane

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Accelerometer-based (and by extension other inertial sensors) research for

Human Activity Recognition (HAR) is a dead-end. This sensor does not offer

enough information for us to progress in the core domain of HAR—to recognize

everyday activities from sensor data. Despite continued and prolonged efforts

in improving feature engineering and machine learning models, the activities

that we can recognize reliably have only expanded slightly and many of the same

flaws of early models are still present today. Instead of relying on

acceleration data, we should consider modalities with much richer

information; a logical choice is images. With the rapid advance in image

sensing hardware and modelling techniques, we believe that a widespread

adoption of image sensors will open many opportunities for accurate and robust

inference across a wide spectrum of human activities.

In this paper, we make the case for imagers in place of accelerometers as the

default sensor for human activity recognition. Our review of past works has led

to the observation that progress in HAR had stalled, caused by our reliance on

accelerometers. We further argue for the suitability of images for activity

recognition by illustrating their richness of information and the marked

progress in computer vision. Through a feasibility analysis, we find that

deploying imagers and CNNs on device poses no substantial burden on modern

mobile hardware. Overall, our work highlights the need to move away from

accelerometers and calls for further exploration of using imagers for activity

recognition.

Learning to Correct 3D Reconstructions from Multiple Views

Ştefan Săftescu, Paul Newman

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

This paper is about reducing the cost of building good large-scale 3D

reconstructions post-hoc. We render 2D views of an existing reconstruction and

train a convolutional neural network (CNN) that refines inverse-depth to match

a higher-quality reconstruction. Since the views that we correct are rendered

from the same reconstruction, they share the same geometry, so overlapping

views complement each other. We take advantage of that in two ways. Firstly, we

impose a loss during training which guides predictions on neighbouring views to

have the same geometry and has been shown to improve performance. Secondly, in

contrast to previous work, which corrects each view independently, we also make

predictions on sets of neighbouring views jointly. This is achieved by warping

feature maps between views and thus bypassing memory-intensive 3D computation.

We make the observation that features in the feature maps are

viewpoint-dependent, and propose a method for transforming features with

dynamic filters generated by a multi-layer perceptron from the relative poses

between views. In our experiments we show that this last step is necessary for

successfully fusing feature maps between views.

UniPose: Unified Human Pose Estimation in Single Images and Videos

Bruno Artacho, Andreas Savakis

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We propose UniPose, a unified framework for human pose estimation, based on

our “Waterfall” Atrous Spatial Pooling architecture, that achieves

state-of-the-art results on several pose estimation metrics. Current pose

estimation methods utilizing standard CNN architectures heavily rely on

statistical postprocessing or predefined anchor poses for joint localization.

UniPose incorporates contextual segmentation and joint localization to estimate

the human pose in a single stage, with high accuracy, without relying on

statistical postprocessing methods. The Waterfall module in UniPose leverages

the efficiency of progressive filtering in the cascade architecture, while

maintaining multi-scale fields-of-view comparable to spatial pyramid

configurations. Additionally, our method is extended to UniPose-LSTM for

multi-frame processing and achieves state-of-the-art results for temporal pose

estimation in video. Our results on multiple datasets demonstrate that UniPose,

with a ResNet backbone and Waterfall module, is a robust and efficient

architecture for pose estimation obtaining state-of-the-art results in single

person pose detection for both single images and videos.

Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread

Haofeng Li, Guanbin Li, Binbin Yang, Guanqi Chen, Liang Lin, Yizhou Yu

Comments: Accepted as a regular paper in the IEEE Transactions on Cybernetics

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Recently deep convolutional neural networks have achieved significant success

in salient object detection. However, existing state-of-the-art methods require

high-end GPUs to achieve real-time performance, which makes them hard to adapt

to low-cost or portable devices. Although generic network architectures have

been proposed to speed up inference on mobile devices, they are tailored to the

task of image classification or semantic segmentation, and struggle to capture

intra-channel and inter-channel correlations that are essential for contrast

modeling in salient object detection. Motivated by the above observations, we

design a new deep learning algorithm for fast salient object detection. The

proposed algorithm for the first time achieves competitive accuracy and high

inference efficiency simultaneously with a single CPU thread. Specifically, we

propose a novel depthwise non-local module (DNL), which implicitly models

contrast via harvesting intra-channel and inter-channel correlations in a

self-attention manner. In addition, we introduce a depthwise non-local network

architecture that incorporates both depthwise non-local modules and inverted

residual blocks. Experimental results show that our proposed network attains

very competitive accuracy on a wide range of salient object detection datasets

while achieving state-of-the-art efficiency among all existing deep learning

based algorithms.

Attention! A Lightweight 2D Hand Pose Estimation Approach

Nicholas Santavas, Ioannis Kansizoglou, Loukas Bampis, Evangelos Karakasis, Antonios Gasteratos

Comments: submitted to IEEE Signal Processing Letters

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

Vision-based human pose estimation is a non-invasive technology for

Human-Computer Interaction (HCI). Direct use of the hand as an input device

provides an attractive interaction method, with no need for specialized sensing

equipment, such as exoskeletons or gloves, other than a camera. Traditionally, HCI

is employed in various applications spreading in areas including manufacturing,

surgery, entertainment industry and architecture, to mention a few. Deployment

of vision based human pose estimation algorithms can give a breath of

innovation to these applications. In this letter, we present a novel

Convolutional Neural Network architecture, reinforced with a Self-Attention

module that can be deployed on an embedded system, due to its lightweight

nature, with just 1.9 Million parameters. The source code and qualitative

results are publicly available.

ResDepth: Learned Residual Stereo Reconstruction

Corinne Stucker, Konrad Schindler

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We propose an embarrassingly simple, but very effective scheme for

high-quality dense stereo reconstruction: (i) generate an approximate

reconstruction with your favourite stereo matcher; (ii) rewarp the input images

with that approximate model; and (iii) with the initial reconstruction and the

warped images as input, train a deep network to enhance the reconstruction by

regressing a residual correction. The strategy to only learn the residual

greatly simplifies the learning problem. A standard Unet without bells and

whistles is enough to reconstruct even small surface details, like dormers and

roof substructures in satellite images. We also investigate residual

reconstruction with less information and find that even a single image is

enough to greatly improve an approximate reconstruction. Our full model reduces

the mean absolute error of state-of-the-art stereo reconstruction systems by

>50%, both in our target domain of satellite stereo and on stereo pairs from

the ETH3D benchmark.

ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data

Di Qi, Lin Su, Jia Song, Edward Cui, Taroon Bharti, Arun Sachet

Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we introduce a new vision-language pre-trained model —

ImageBERT — for image-text joint embedding. Our model is a Transformer-based

model, which takes different modalities as input and models the relationship

between them. The model is pre-trained on four tasks simultaneously: Masked

Language Modeling (MLM), Masked Object Classification (MOC), Masked Region

Feature Regression (MRFR), and Image Text Matching (ITM). To further enhance

the pre-training quality, we have collected a Large-scale weAk-supervised

Image-Text (LAIT) dataset from the Web. We first pre-train the model on this

dataset, then conduct a second stage pre-training on Conceptual Captions and

SBU Captions. Our experiments show that a multi-stage pre-training strategy

outperforms single-stage pre-training. We also fine-tune and evaluate our

pre-trained ImageBERT model on image retrieval and text retrieval tasks, and

achieve new state-of-the-art results on both MSCOCO and Flickr30k datasets.

A Fixation-based 360° Benchmark Dataset for Salient Object Detection

Yi Zhang, Lu Zhang, Wassim Hamidouche, Olivier Deforges

Comments: 5 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Fixation prediction (FP) in panoramic contents has been widely investigated

along with the booming trend of virtual reality (VR) applications. However,

another issue within the field of visual saliency, salient object detection

(SOD), has been seldom explored in 360° (or omnidirectional) images due to

the lack of datasets representative of real scenes with pixel-level

annotations. Toward this end, we collect 107 equirectangular panoramas with

challenging scenes and multiple object classes. Based on the consistency

between FP and explicit saliency judgements, we further manually annotate 1,165

salient objects over the collected images with precise masks under the guidance

of real human eye fixation maps. Six state-of-the-art SOD models are then

benchmarked on the proposed fixation-based 360° image dataset (F-360iSOD),

by applying a multiple cubic projection-based fine-tuning method. Experimental

results show a limitation of the current methods when used for SOD in panoramic

images, which indicates the proposed dataset is challenging. Key issues for

360° SOD are also discussed. The proposed dataset is available at

this https URL .

Optimized Generic Feature Learning for Few-shot Classification across Domains

Tonmoy Saikia, Thomas Brox, Cordelia Schmid

Subjects: Computer Vision and Pattern Recognition (cs.CV)

To learn models or features that generalize across tasks and domains is one

of the grand goals of machine learning. In this paper, we propose to use

cross-domain, cross-task data as validation objective for hyper-parameter

optimization (HPO) to improve on this goal. Given a rich enough search space,

optimization of hyper-parameters learns features that maximize validation

performance and, due to the objective, generalize across tasks and domains. We

demonstrate the effectiveness of this strategy on few-shot image classification

within and across domains. The learned features outperform all previous

few-shot and meta-learning approaches.

Dynamic multi-object Gaussian process models: A framework for data-driven functional modelling of human joints

Jean-Rassaire Fouefack, Bhushan Borotikar, Tania S. Douglas, Valérie Burdin, Tinashe E.M. Mutsvangwa

Comments: 15 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Statistical shape models (SSMs) are state-of-the-art medical image analysis

tools for extracting and explaining features across a set of biological

structures. However, a principled and robust way to combine shape and pose

features has been elusive due to three main issues: 1) Non-homogeneity of the

data (data with linear and non-linear natural variation across features), 2)

non-optimal representation of the (3D) motion (rigid transformation

representations that are not proportional to the kinetic energy that move an

object from one position to the other), and 3) artificial discretization of the

models. In this paper, we propose a new dynamic multi-object

statistical modelling framework for the analysis of human joints in a

continuous domain. Specifically, we propose to normalise shape and dynamic

spatial features in the same linearized statistical space permitting the use of

linear statistics; we adopt an optimal 3D motion representation for more

accurate rigid transformation comparisons; and we provide a 3D shape and pose

prediction protocol using a Markov chain Monte Carlo sampling-based fitting.

The framework affords an efficient generative dynamic multi-object modelling

platform for biological joints. We validate the framework using controlled

synthetic data. Finally, the framework is applied to an analysis of the human

shoulder joint to compare its performance with standard SSM approaches in

prediction of shape while adding the advantage of determining relative pose

between bones in a complex. Excellent validity is observed and the shoulder

joint shape-pose prediction results suggest that the novel framework may have

utility for a range of medical image analysis applications. Furthermore, the

framework is generic and can be extended to n > 2 objects, making it suitable

for clinical and diagnostic methods for the management of joint disorders.

Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift

Ryuhei Takahashi, Masaaki Iiyama, Atsushi Hashimoto, Motoharu Sonogashira

Subjects: Computer Vision and Pattern Recognition (cs.CV)

This paper proposes a novel approach for unsupervised domain adaptation (UDA)

with target shift. Target shift is a problem of mismatch in label distribution

between source and target domains. Typically it appears as class-imbalance in

target domain. In practice, this is an important problem in UDA; as we do not

know labels in target domain datasets, we do not know whether or not its

distribution is identical to that in the source domain dataset. Many

traditional approaches achieve UDA with distribution matching by minimizing

the maximum mean discrepancy or by adversarial training; however, these approaches

implicitly assume a coincidence in the distributions and do not work under

situations with target shift. Some recent UDA approaches focus on class

boundary and some of them are robust to target shift, but they are only

applicable to classification and not to regression.

To overcome the target shift problem in UDA, the proposed method, partially

shared variational autoencoders (PS-VAEs), uses pair-wise feature alignment

instead of feature distribution matching. PS-VAEs inter-convert domain of each

sample by a CycleGAN-based architecture while preserving its label-related

content. To evaluate the performance of PS-VAEs, we carried out two

experiments: UDA with class-unbalanced digits datasets (classification), and

UDA from synthesized data to real observation in human-pose-estimation

(regression). The proposed method presented its robustness against the

class-imbalance in the classification task, and outperformed the other methods

in the regression task with a large margin.
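
For readers unfamiliar with the distribution-matching baseline this abstract contrasts itself with, the block below gives the standard (biased) empirical estimate of the squared maximum mean discrepancy (MMD) between source and target samples. It is a textbook definition offered for orientation, not a formula taken from the paper; the symbols $x_i^s$, $x_j^t$, $n$, $m$, the feature map $\phi$, and the reproducing kernel Hilbert space $\mathcal{H}$ are notation assumed here.

```latex
\[
\widehat{\mathrm{MMD}}^2(X_s, X_t)
  = \left\| \frac{1}{n} \sum_{i=1}^{n} \phi\!\left(x_i^{s}\right)
          - \frac{1}{m} \sum_{j=1}^{m} \phi\!\left(x_j^{t}\right)
    \right\|_{\mathcal{H}}^{2}
\]
```

Minimizing this quantity pulls the two feature distributions together as a whole, which is why it can misalign classes when the label proportions of the two domains differ, the target-shift situation described above.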

Curvature Regularized Surface Reconstruction from Point Cloud

Yuchen He, Sung Ha Kang, Hao Liu

Comments: 22 pages, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We propose a variational functional and fast algorithms to reconstruct

implicit surface from point cloud data with a curvature constraint. The

minimizing functional balances the distance function from the point cloud and

the mean curvature term. Only the point location is used, without any local

normal or curvature estimation at each point. With the added curvature

constraint, the computation becomes particularly challenging. To enhance the

computational efficiency, we solve the problem by a novel operator splitting

scheme. It replaces the original high-order PDEs by a decoupled PDE system,

which is solved by a semi-implicit method. We also discuss an approach using an

augmented Lagrangian method. The proposed method shows robustness against

noise, and recovers concave features and sharp corners better compared to

models without a curvature constraint. Numerical experiments on two- and

three-dimensional data sets, including noisy and sparse data, are presented to

validate the model.

M^2 Deep-ID: A Novel Model for Multi-View Face Identification Using Convolutional Deep Neural Networks

Sara Shahsavarani , Morteza Analoui , Reza Shoja Ghiass Subjects : Computer Vision and Pattern Recognition (cs.CV)

Despite significant advances in Deep Face Recognition (DFR) systems,

introducing new DFRs under specific constraints such as varying pose still

remains a big challenge. Most particularly, due to the 3D nature of a human

head, facial appearance of the same subject introduces a high intra-class

variability when projected to the camera image plane. In this paper, we propose

a new multi-view Deep Face Recognition (MVDFR) system to address the mentioned

challenge. In this context, multiple 2D images of each subject under different

views are fed into the proposed deep neural network with a unique design to

re-express the facial features in a single and more compact face descriptor,

which in turn, produces a more informative and abstract way for face

identification using convolutional neural networks. To extend the functionality

of our proposed system to multi-view facial images, the gold-standard Deep-ID

model is modified in our proposed model. The experimental results indicate that

our proposed method yields a 99.8% accuracy, while the state-of-the-art method

achieves a 97% accuracy. We also gathered the Iran University of Science and

Technology (IUST) face database with 6552 images of 504 subjects to accomplish

our experiments.

LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching

Angfan Zhu , Jiaqi Yang , Chen Zhao , Ke Xian , Zhiguo Cao , Xin Li

Comments: 7 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

The local reference frame (LRF) plays a critical role in 3D local shape

description and matching. However, most existing LRFs are hand-crafted and

suffer from limited repeatability and robustness. This paper presents the first

attempt to learn an LRF via a Siamese network that needs weak supervision only.

In particular, we argue that each neighboring point in the local surface gives

a unique contribution to LRF construction and measure such contributions via

learned weights. Extensive analysis and comparative experiments on three public

datasets addressing different application scenarios have demonstrated that

LRF-Net is more repeatable and robust than several state-of-the-art LRF methods

(LRF-Net is only trained on one dataset). In addition, LRF-Net can

significantly boost the local shape description and 6-DoF pose estimation

performance when matching 3D point clouds.

Depth-Based Selective Blurring in Stereo Images Using Accelerated Framework

Subhayan Mukherjee , Ram Mohana Reddy Guddeti

Comments: arXiv admin note: text overlap with arXiv:2001.06967

Journal-ref: 3D Research (Springer) 5, Article number: 14 (2014)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

We propose a hybrid method for stereo disparity estimation by combining block

and region-based stereo matching approaches. It generates dense depth maps from

disparity measurements of only 18% of image pixels (left or right). The

methodology involves segmenting pixel lightness values using fast K-Means

implementation, refining segment boundaries using morphological filtering and

connected components analysis; then determining boundaries’ disparities using

sum of absolute differences (SAD) cost function. Complete disparity maps are

reconstructed from boundaries’ disparities. We consider an application of our

method for depth-based selective blurring of non-interest regions of stereo

images, using Gaussian blur to de-focus users’ non-interest regions.

Experiments on the Middlebury dataset demonstrate that our method outperforms

traditional disparity estimation approaches using SAD and normalized cross

correlation by up to 33.6% and some recent methods by up to 6.1%. Further,

our method is highly parallelizable using a CPU and GPU framework based on Java

Thread Pool and APARAPI with a speed-up of 5.8 for 250 stereo video frames (4,096

x 2,304).
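The sum of absolute differences (SAD) cost named in the abstract above can be illustrated with a minimal block-matching sketch. This is not the authors' hybrid pipeline (there is no K-Means segmentation or boundary refinement here); the function names `sad` and `best_disparity`, the window size, and the toy image pair are all assumptions made for illustration.

```python
def sad(left, right, row, col, disparity, half):
    """Sum of absolute differences between two square windows.

    The left window is centered at (row, col); the right window is the
    same window shifted `disparity` pixels to the left.
    """
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col + dc]
                         - right[row + dr][col + dc - disparity])
    return total


def best_disparity(left, right, row, col, max_disparity, half=1):
    """Return the disparity in [0, max_disparity] with the lowest SAD cost."""
    # Keep the shifted window inside the right image.
    limit = min(max_disparity, col - half)
    costs = [sad(left, right, row, col, d, half) for d in range(limit + 1)]
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical toy pair: every left pixel appears 2 columns earlier on the right.
left = [[(7 * c + 3 * r) % 23 for c in range(12)] for r in range(5)]
right = [[left[r][min(c + 2, 11)] for c in range(12)] for r in range(5)]

print(best_disparity(left, right, row=2, col=6, max_disparity=4))
```

In a pipeline like the one described, a cost of this kind would be evaluated only at segment boundaries rather than at every pixel, which is where the reported saving in disparity measurements would come from.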

Scientific Image Tampering Detection Based On Noise Inconsistencies: A Method And Datasets

Ziyue Xiang , Daniel Acuna Subjects : Computer Vision and Pattern Recognition (cs.CV)

Scientific image tampering is a problem that affects not only authors but

also the general perception of the research community. Although previous

researchers have developed methods to identify tampering in natural images,

these methods may not thrive under the scientific setting as scientific images

have different statistics, format, quality, and intentions. Therefore, we

propose a scientific-image specific tampering detection method based on noise

inconsistencies, which is capable of learning and generalizing to different

fields of science. We train and test our method on a new dataset of manipulated

western blot and microscopy imagery, which aims at emulating problematic images

in science. The test results show that our method can detect various types of

image manipulation in different scenarios robustly, and it outperforms existing

general-purpose image tampering detection schemes. We discuss applications

beyond these two types of images and suggest next steps for making detection of

problematic images a systematic step in peer review and science in general.

Weakly Supervised Temporal Action Localization Using Deep Metric Learning

Ashraful Islam , Richard J. Radke

Comments: accepted to WACV 2020

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Temporal action localization is an important step towards video

understanding. Most current action localization methods depend on untrimmed

videos with full temporal annotations of action instances. However, it is

expensive and time-consuming to annotate both action labels and temporal

boundaries of videos. To this end, we propose a weakly supervised temporal

action localization method that only requires video-level action instances as

supervision during training. We propose a classification module to generate

action labels for each segment in the video, and a deep metric learning module

to learn the similarity between different action instances. We jointly optimize

a balanced binary cross-entropy loss and a metric loss using a standard

backpropagation algorithm. Extensive experiments demonstrate the effectiveness

of both of these components in temporal localization. We evaluate our algorithm

on two challenging untrimmed video datasets: THUMOS14 and ActivityNet1.2. Our

approach improves the current state-of-the-art result for THUMOS14 by 6.5% mAP

at IoU threshold 0.5, and achieves competitive performance for ActivityNet1.2.

Deep Depth Prior for Multi-View Stereo

Pallabi Ghosh , Vibhav Vineet , Larry S. Davis , Abhinav Shrivastava , Sudipta Sinha , Neel Joshi Subjects : Computer Vision and Pattern Recognition (cs.CV)

It was recently shown that the structure of convolutional neural networks

induces a strong prior favoring natural color images, a phenomenon referred to

as a deep image prior (DIP), which can be an effective regularizer in inverse

problems such as image denoising, inpainting, etc. In this paper, we investigate

a similar idea for depth images, which we call a deep depth prior.

Specifically, given a color image and a noisy and incomplete target depth map

from the same viewpoint, we optimize a randomly initialized CNN model to

reconstruct an RGB-D image where the depth channel gets restored by virtue of

using the network structure as a prior. We propose using deep depth priors for

refining and inpainting noisy depth maps within a multi-view stereo pipeline.

We optimize the network parameters to minimize two losses: 1) an RGB-D

reconstruction loss based on the noisy depth map and 2) a multi-view

photoconsistency-based loss, which is computed using images from a

geometrically calibrated camera from nearby viewpoints. Our quantitative and

qualitative evaluation shows that our refined depth maps are more accurate and

complete, and after fusion, produce dense 3D models of higher quality.

Lesion Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale

Jinzheng Cai , Adam P. Harrison , Youjing Zheng , Ke Yan , Yuankai Huo , Jing Xiao , Lin Yang , Le Lu

Comments: This work has been submitted to the IEEE for possible publication

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Acquiring large-scale medical image data, necessary for training machine

learning algorithms, is frequently intractable, due to prohibitive

expert-driven annotation costs. Recent datasets extracted from hospital

archives, e.g., DeepLesion, have begun to address this problem. However, these

are often incompletely or noisily labeled, e.g., DeepLesion leaves over 50% of

its lesions unlabeled. Thus, effective methods to harvest missing annotations

are critical for continued progress in medical image analysis. This is the goal

of our work, where we develop a powerful system to harvest missing lesions from

the DeepLesion dataset at high precision. Accepting the need for some degree of

expert labor to achieve high fidelity, we exploit a small fully-labeled subset

of medical image volumes and use it to intelligently mine annotations from the

remainder. To do this, we chain together a highly sensitive lesion proposal

generator and a very selective lesion proposal classifier. While our framework

is generic, we optimize our performance by proposing a 3D contextual lesion

proposal generator and by using a multi-view multi-scale lesion proposal

classifier. These produce harvested and hard-negative proposals, which we then

re-use to finetune our proposal generator by using a novel hard negative

suppression loss, continuing this process until no extra lesions are found.

Extensive experimental analysis demonstrates that our method can harvest an

additional 9,805 lesions while keeping precision above 90%. To demonstrate the

benefits of our approach, we show that lesion detectors trained on our

harvested lesions can significantly outperform the same variants only trained

on the original annotations, with a boost in average precision of 7% to 10%. We

open source our code and annotations at

this https URL .

Adaptive Loss Function for Super Resolution Neural Networks Using Convex Optimization Techniques

Seyed Mehdi Ayyoubzadeh , Xiaolin Wu Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)

The Single Image Super-Resolution (SISR) task refers to learning a mapping from

low-resolution images to the corresponding high-resolution ones. This task is

known to be extremely difficult since it is an ill-posed problem. Recently,

Convolutional Neural Networks (CNNs) have achieved state of the art performance

on SISR. However, the images produced by CNNs do not contain fine details of

the images. Generative Adversarial Networks (GANs) aim to solve this issue and

recover sharp details. Nevertheless, GANs are notoriously difficult to train.

Besides that, they generate artifacts in the high-resolution images. In this

paper, we have proposed a method in which CNNs try to align images in different

spaces rather than only the pixel space. Such a space is designed using convex

optimization techniques. CNNs are encouraged to learn high-frequency components

of the images as well as low-frequency components. We have shown that the

proposed method can recover fine details of the images and it is stable in the

training process.

Block-wise Scrambled Image Recognition Using Adaptation Network

Koki Madono , Masayuki Tanaka , Masaki Onishi , Tetsuji Ogawa

Comments: 6 pages, Artificial Intelligence of Things (AAAI-2020 WS)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

In this study, a perceptually hidden object-recognition method is

investigated to generate secure images recognizable by machines but not humans.

Hence, both the perceptual information hiding and the corresponding object

recognition methods should be developed. Block-wise image scrambling is

introduced to hide perceptual information from a third party. In addition, an

adaptation network is proposed to recognize those scrambled images.

Experimental comparisons conducted using CIFAR datasets demonstrated that the

proposed adaptation network performed well in incorporating simple perceptual

information hiding into DNN-based image classification.

EMOPAIN Challenge 2020: Multimodal Pain Evaluation from Facial and Bodily Expressions

Nadia Berthouze , Michel Valstar , Amanda Williams , Joy Egede , Temitayo Olugbade , Chongyang Wang , Hongyin Meng , Min Aung , Nicholas Lane , Siyang Song

Comments: 8 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

The EmoPain 2020 Challenge is the first international competition aimed at

creating a uniform platform for the comparison of machine learning and

multimedia processing methods of automatic chronic pain assessment from human

expressive behaviour, and also the identification of pain-related behaviours.

The objective of the challenge is to promote research in the development of

assistive technologies that help improve the quality of life for people with

chronic pain via real-time monitoring and feedback to help manage their

condition and remain physically active. The challenge also aims to encourage

the use of the relatively underutilised, albeit vital bodily expression signals

for automatic pain and pain-related emotion recognition. This paper presents a

description of the challenge, competition guidelines, bench-marking dataset,

and the baseline systems’ architecture and performance on the three sub-tasks:

pain estimation from facial expressions, pain recognition from multimodal

movement, and protective movement behaviour detection.

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

Xiaolong Ma , Wei Niu , Tianyun Zhang , Sijia Liu , Fu-Ming Guo , Sheng Lin , Hongjia Li , Xiang Chen , Jian Tang , Kaisheng Ma , Bin Ren , Yanzhi Wang

Comments: arXiv admin note: text overlap with arXiv:1909.05073

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Weight pruning has been widely acknowledged as a straightforward and

effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby

achieving acceleration on various platforms. However, most of the pruning

techniques are essentially trade-offs between model accuracy and regularity

which lead to impaired inference accuracy and limited on-device acceleration

performance. To solve the problem, we introduce a new sparsity dimension,

namely pattern-based sparsity, which comprises pattern and connectivity sparsity

and is both highly accurate and hardware friendly. With carefully

designed patterns, the proposed pruning unprecedentedly and consistently

achieves accuracy enhancement and better feature extraction ability on

different DNN structures and datasets, and our pattern-aware pruning framework

also achieves pattern library extraction, pattern selection, pattern and

connectivity pruning and weight training simultaneously. Our approach on the

new pattern-based sparsity naturally fits into compiler optimization for highly

efficient DNN execution on mobile platforms. To the best of our knowledge, it

is the first time that mobile devices achieve real-time inference for the

large-scale DNN models thanks to the unique spatial property of pattern-based

sparsity and the help of the code generation capability of compilers.

Pruning CNN's with linear filter ensembles

Csanád Sándor , Szabolcs Pável , Lehel Csató

Comments: accepted to ECAI2020

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Despite the promising results of convolutional neural networks (CNNs),

applying them on resource limited devices is still a challenge, mainly due to

the huge memory and computation requirements. To tackle these problems, pruning

can be applied to reduce the network size and number of floating point

operations (FLOPs). Contrary to the filter norm method, which is used

in network pruning and assumes that the smaller this norm, the less

important the associated component, we develop a novel filter importance

norm that incorporates the loss caused by the elimination of a component from

the CNN.

To estimate the importance of a set of architectural components, we measure

the CNN performance as different components are removed. The result is a

collection of filter ensembles — filter masks — and associated performance

values. We rank the filters based on a linear and additive model and remove the

least important ones such that the drop in network accuracy is minimal. We

evaluate our method on a fully connected network, as well as on the ResNet

architecture trained on the CIFAR-10 data-set. Using our pruning method, we

managed to remove 60% of the parameters and 64% of the FLOPs from the

ResNet with an accuracy drop of less than 0.6%.
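One way to read the "linear and additive model" over filter masks described above is as a least-squares fit of network performance against binary keep/remove indicators, with the fitted coefficients serving as importance scores. The sketch below follows that reading only; the ridge term, the helper names `solve` and `filter_importance`, and the synthetic accuracies are assumptions, not the authors' procedure or data.

```python
import itertools


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def filter_importance(masks, scores, ridge=1e-6):
    """Fit scores ~ bias + sum_i w_i * mask_i and return the weights w_i.

    masks[k][i] is 1 if filter i is kept in ensemble k, else 0.
    The small ridge term (an assumption) keeps the system well conditioned.
    """
    rows = [[1.0] + [float(v) for v in mask] for mask in masks]
    dim = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
             for j in range(dim)] for i in range(dim)]
    rhs = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(dim)]
    return solve(gram, rhs)[1:]


# Hypothetical data: four filters whose true additive contributions are known.
true_gain = [0.30, 0.02, 0.15, 0.01]
masks = list(itertools.product([0, 1], repeat=4))
scores = [0.40 + sum(g * m for g, m in zip(true_gain, mask)) for mask in masks]

weights = filter_importance(masks, scores)
# Least important filters first: these would be the pruning candidates.
ranking = sorted(range(len(weights)), key=weights.__getitem__)
print(ranking)
```

With real networks the masks would be sampled rather than enumerated, and the scores would come from evaluating the masked network on held-out data.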

Optimizing Generative Adversarial Networks for Image Super Resolution via Latent Space Regularization

Sheng Zhong , Shifu Zhou (Agora.io)

Comments: 11 pages, 5 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Natural images can be regarded as residing in a manifold that is embedded in

a higher dimensional Euclidean space. Generative Adversarial Networks (GANs)

try to learn the distribution of the real images in the manifold to generate

samples that look real. But the results of existing methods still exhibit many

unpleasant artifacts and distortions even for the cases where the desired

ground truth target images are available for supervised learning such as in

single image super resolution (SISR). We probe for ways to alleviate these

problems for supervised GANs in this paper. We explicitly apply the Lipschitz

Continuity Condition (LCC) to regularize the GAN. An encoding network that maps

the image space to a new optimal latent space is derived from the LCC, and it

is used to augment the GAN as a coupling component. The LCC is also converted

to new regularization terms in the generator loss function to enforce local

invariance. The GAN is optimized together with the encoding network in an

attempt to make the generator converge to a more ideal and disentangled mapping

that can generate samples more faithful to the target images. When the proposed

models are applied to the single image super resolution problem, the results

outperform the state of the art.

DeepFL-IQA: Weak Supervision for Deep IQA Feature Learning

Hanhe Lin , Vlad Hosu , Dietmar Saupe

Comments: dataset url: this http URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Multi-level deep-features have been driving state-of-the-art methods for

aesthetics and image quality assessment (IQA). However, most IQA benchmarks are

comprised of artificially distorted images, for which features derived from

ImageNet under-perform. We propose a new IQA dataset and a weakly supervised

feature learning approach to train features more suitable for IQA of

artificially distorted images. The dataset, KADIS-700k, is far more extensive

than similar works, consisting of 140,000 pristine images, 25 distortion

types, totaling 700k distorted versions. Our weakly supervised feature learning

is designed as a multi-task learning type training, using eleven existing

full-reference IQA metrics as proxies for differential mean opinion scores. We

also introduce a benchmark database, KADID-10k, of artificially degraded

images, each subjectively annotated by 30 crowd workers. We make use of our

derived image feature vectors for (no-reference) image quality assessment by

training and testing a shallow regression network on this database and five

other benchmark IQA databases. Our method, termed DeepFL-IQA, performs better

than other feature-based no-reference IQA methods and also better than all

tested full-reference IQA methods on KADID-10k. For the other five benchmark

IQA databases, DeepFL-IQA matches the performance of the best existing

end-to-end deep learning-based methods on average.

ManyModalQA: Modality Disambiguation and QA over Diverse Inputs

Darryl Hannan , Akshay Jain , Mohit Bansal

Comments: AAAI 2020 (10 pages)

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

We present a new multimodal question answering challenge, ManyModalQA, in

which an agent must answer a question by considering three distinct modalities:

text, images, and tables. We collect our data by scraping Wikipedia and then

utilize crowdsourcing to collect question-answer pairs. Our questions are

ambiguous, in that the modality that contains the answer is not easily

determined based solely upon the question. To demonstrate this ambiguity, we

construct a modality selector (or disambiguator) network, and this model gets

substantially lower accuracy on our challenge set, compared to existing

datasets, indicating that our questions are more ambiguous. By analyzing this

model, we investigate which words in the question are indicative of the

modality. Next, we construct a simple baseline ManyModalQA model, which, based

on the prediction from the modality selector, fires a corresponding pre-trained

state-of-the-art unimodal QA model. We focus on providing the community with a

new manymodal evaluation set and only provide a fine-tuning set, with the

expectation that existing datasets and approaches will be transferred for most

of the training, to encourage low-resource generalization without large,

monolithic training sets for each new task. There is a significant gap between

our baseline models and human performance; therefore, we hope that this

challenge encourages research in end-to-end modality disambiguation and

multimodal QA models, as well as transfer learning. Code and data available at:

this https URL

Safety Concerns and Mitigation Approaches Regarding the Use of Deep Learning in Safety-Critical Perception Tasks

Oliver Willers , Sebastian Sudholt , Shervin Raafatnia , Stephanie Abrecht Subjects : Machine Learning (cs.LG) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Deep learning methods are widely regarded as indispensable when it comes to

designing perception pipelines for autonomous agents such as robots, drones or

automated vehicles. The main reasons, however, for deep learning not being used

for autonomous agents at large scale already are safety concerns. Deep learning

approaches typically exhibit a black-box behavior which makes it hard for them

to be evaluated with respect to safety-critical aspects. While there has been

some work on safety in deep learning, most papers typically focus on high-level

safety concerns. In this work, we seek to dive into the safety concerns of deep

learning methods and present a concise enumeration on a deeply technical level.

Additionally, we present extensive discussions on possible mitigation methods

and give an outlook regarding what mitigation methods are still missing in

order to facilitate an argumentation for the safety of a deep learning method.

Anomaly detection in chest radiographs with a weakly supervised flow-based deep learning method

H. Shibata (1), S. Hanaoka (2), Y. Nomura (1), T. Nakao (3), I. Sato (2 and 4 and 5), N. Hayashi (1), O. Abe (2 and 3) ((1) Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, (2) Department of Radiology, The University of Tokyo Hospital, (3) Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, (4) Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo, (5) Center for Advanced Intelligence Project, RIKEN) Subjects : Image and Video Processing (eess.IV) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Preventing the oversight of anomalies in chest X-ray radiographs (CXRs)

during diagnosis is a crucial issue. Deep learning (DL)-based anomaly detection

methods are rapidly growing in popularity, and provide effective solutions to

the problem, but the workload in labeling CXRs during the training procedure

remains heavy. To reduce the workload, a novel anomaly detection method for

CXRs based on weakly supervised DL is presented in this study. The DL is based

on a flow-based deep neural network (DNN) framework with which two normality

metrics (logarithm likelihood and logarithm likelihood ratio) can be

calculated. With this method, only one set of normal CXRs requires labeling to

train the DNN, then the normality of any unknown CXR can be evaluated. The area

under the receiver operating characteristic curve acquired with the logarithm

likelihood ratio metric (approximately 0.783) was greater than that obtained with

the logarithm likelihood metric, and was a value comparable to those in

previous studies where other weakly supervised DNNs were implemented.

GhostImage: Perception Domain Attacks against Vision-based Object Classification Systems

Yanmao Man , Ming Li , Ryan Gerdes Subjects : Cryptography and Security (cs.CR) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

In vision-based object classification systems, imaging sensors perceive the

environment and then objects are detected and classified for decision-making

purposes. Vulnerabilities in the perception domain enable an attacker to inject

false data into the sensor which could lead to unsafe consequences. In this

work, we focus on camera-based systems and propose GhostImage attacks, with the

goal of either creating a fake perceived object or obfuscating the object’s

image that leads to wrong classification results. This is achieved by remotely

projecting adversarial patterns into camera-perceived images, exploiting two

common effects in optical imaging systems, namely lens flare/ghost effects, and

auto-exposure control. To improve the robustness of the attack to channel

perturbations, we generate optimal input patterns by integrating adversarial

machine learning techniques with a trained end-to-end channel model. We realize

GhostImage attacks with a projector and conduct comprehensive experiments,

using three different image datasets, in indoor and outdoor environments, and

three different cameras. We demonstrate that GhostImage attacks are applicable

to both autonomous driving and security surveillance scenarios. Experimental

results show that, depending on the projector-camera distance, attack success

rates can reach as high as 100%.

TEASER: Fast and Certifiable Point Cloud Registration

Heng Yang , Jingnan Shi , Luca Carlone

Comments: 20 pages main text, 22 pages appendix

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)

We propose the first fast and certifiable algorithm for the registration of

two sets of 3D points in the presence of large amounts of outlier

correspondences. Towards this goal, we first reformulate the registration

problem using a Truncated Least Squares (TLS) cost that makes the estimation

insensitive to spurious correspondences. Then, we provide a general

graph-theoretic framework to decouple scale, rotation, and translation

estimation, which allows solving in cascade for the three transformations.

Despite the fact that each subproblem is still non-convex and combinatorial in

nature, we show that (i) TLS scale and (component-wise) translation estimation

can be solved in polynomial time via an adaptive voting scheme, (ii) TLS

rotation estimation can be relaxed to a semidefinite program (SDP) and the

relaxation is tight, even in the presence of extreme outlier rates. We name the

resulting algorithm TEASER (Truncated least squares Estimation And SEmidefinite

Relaxation). While solving large SDP relaxations is typically slow, we develop

a second certifiable algorithm, named TEASER++, that circumvents the need to

solve an SDP and runs in milliseconds. For both algorithms, we provide

theoretical bounds on the estimation errors, which are the first of their kind

for robust registration problems. Moreover, we test their performance on

standard benchmarks, object detection datasets, and the 3DMatch scan matching

dataset, and show that (i) both algorithms dominate the state of the art (e.g.,

RANSAC, branch-&-bound, heuristics) and are robust to more than 99% outliers,

(ii) TEASER++ can run in milliseconds and it is currently the fastest robust

registration algorithm, (iii) TEASER++ is so robust it can also solve problems

without correspondences (e.g., hypothesizing all-to-all correspondences) where

it largely outperforms ICP. We release a fast open-source C++ implementation of

TEASER++.
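To make the truncated least squares (TLS) cost and the voting idea in the abstract above concrete, here is a simplified scalar sketch. It assumes a cost of the form sum_k min((s - s_k)^2 / alpha_k^2, c_bar^2) and enumerates candidate consensus sets between sorted interval boundaries. The function name `tls_scalar`, the parameter names, and the sample data are hypothetical, and this is not the released TEASER++ implementation.

```python
def tls_scalar(measurements, noise_bounds, c_bar=1.0):
    """Estimate a scalar under a TLS cost by enumerating consensus sets.

    Measurement s_k counts as an inlier for any estimate within
    alpha_k * c_bar of it, so the consensus set can only change at the
    interval boundaries s_k - alpha_k * c_bar and s_k + alpha_k * c_bar.
    """
    pairs = list(zip(measurements, noise_bounds))
    edges = sorted([s - a * c_bar for s, a in pairs]
                   + [s + a * c_bar for s, a in pairs])
    best_cost, best_estimate = float("inf"), None
    for lo, hi in zip(edges, edges[1:]):
        mid = 0.5 * (lo + hi)
        inliers = [(s, a) for s, a in pairs if abs(mid - s) <= a * c_bar]
        if not inliers:
            continue
        # Weighted least-squares estimate over the current consensus set.
        weights = [1.0 / (a * a) for _, a in inliers]
        estimate = sum(w * s for w, (s, _) in zip(weights, inliers)) / sum(weights)
        # Full TLS cost of that estimate over all measurements.
        cost = sum(min((estimate - s) ** 2 / (a * a), c_bar ** 2)
                   for s, a in pairs)
        if cost < best_cost:
            best_cost, best_estimate = cost, estimate
    return best_estimate, best_cost


# Hypothetical data: four consistent measurements near 2.0 and three outliers.
samples = [2.0, 2.1, 1.9, 2.05, 9.0, -4.0, 15.0]
bounds = [0.2] * len(samples)
estimate, cost = tls_scalar(samples, bounds)
print(estimate, cost)
```

Because each outlier contributes at most c_bar^2 to the cost, arbitrarily distant measurements cannot pull the estimate away from the largest consistent group. This naive version re-scans all measurements for every interval and so runs in quadratic time.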

Artificial Intelligence

StarAI: Reducing incompleteness in the game of Bridge using PLP

J Li , S Thepaut , V Ventos Subjects : Artificial Intelligence (cs.AI)

Bridge is a trick-taking card game requiring the ability to evaluate

probabilities since it is a game of incomplete information where each player

only sees its cards. In order to choose a strategy, a player needs to gather

information about the hidden cards in the other players’ hand. We present a

methodology allowing us to model a part of card playing in Bridge using

Probabilistic Logic Programming.

DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction

Xingyao Zhang , Cao Xiao , Lucas M. Glass , Jimeng Sun

Comments: accepted by The World Wide Web Conference 2020

Subjects: Artificial Intelligence (cs.AI)

Clinical trials are essential for drug development but often suffer from

expensive, inaccurate and insufficient patient recruitment. The core problem of

patient-trial matching is to find qualified patients for a trial, where patient

information is stored in electronic health records (EHR) while trial

eligibility criteria (EC) are described in text documents available on the web.

How to represent longitudinal patient EHR? How to extract complex logical rules

from EC? Most existing works rely on manual rule-based extraction, which is

time consuming and inflexible for complex inference. To address these

challenges, we proposed DeepEnroll, a cross-modal inference learning model to

jointly encode enrollment criteria (text) and patient records (tabular data)

into a shared latent space for matching inference. DeepEnroll applies a

pre-trained Bidirectional Encoder Representations from Transformers (BERT) model

to encode clinical trial information into sentence embeddings, and uses a

hierarchical embedding model to represent patient longitudinal EHR. In

addition, DeepEnroll is augmented by a numerical information embedding and

entailment module to reason over numerical information in both EC and EHR.

These encoders are trained jointly to optimize the patient-trial matching score. We

evaluated DeepEnroll on the trial-patient matching task using

real-world datasets. DeepEnroll outperformed the best baseline by up to 12.4%

in average F1.

Accelerating supply chains with Ant Colony Optimization across range of hardware solutions

Ivars Dzalbs , Tatiana Kalganova Subjects : Artificial Intelligence (cs.AI) ; Distributed, Parallel, and Cluster Computing (cs.DC); Neural and Evolutionary Computing (cs.NE)

The Ant Colony algorithm has been applied to various optimization problems;

however, most of the previous work on scaling and parallelism focuses on

Travelling Salesman Problems (TSPs). Although useful for benchmarks and new

idea comparison, the algorithmic dynamics do not always transfer to complex

real-life problems, where additional meta-data is required during solution

construction. This paper looks at a real-life outbound supply chain problem using

Ant Colony Optimization (ACO) and its scaling dynamics with two parallel ACO

architectures – Independent Ant Colonies (IAC) and Parallel Ants (PA). Results

showed that PA was able to reach a higher solution quality in fewer iterations

as the number of parallel instances increased. Furthermore, speed performance

was measured across three different hardware solutions – 16 core CPU, 68 core

Xeon Phi and up to 4 GeForce GPUs. State-of-the-art ACO vectorization

techniques such as SS-Roulette were implemented using C++ and CUDA. Although

excellent for TSP, it was concluded that for the given supply chain problem

GPUs are not suitable due to the meta-data access footprint required. Furthermore,

compared to their sequential counterpart, the vectorized CPU AVX2 implementation

achieved 25.4x speedup on CPU while Xeon Phi with its AVX512 instruction set

reached 148x on PA with Vectorized (PAwV). PAwV is therefore able to scale at

least up to 1024 parallel instances on the supply chain network problem solved.
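
As a rough illustration of the solution-construction step that such ACO variants parallelize, the sketch below implements the textbook ACO transition rule with plain roulette-wheel selection. It is not the authors' C++/CUDA code and not SS-Roulette; the function names, the `alpha`/`beta` exponents, and the example values are assumptions made for illustration only.

```python
import random


def transition_probabilities(pheromone, heuristic, alpha=1.0, beta=1.0):
    """Textbook ACO rule: weight each candidate by pheromone^alpha * heuristic^beta."""
    weights = [(tau ** alpha) * (eta ** beta) for tau, eta in zip(pheromone, heuristic)]
    total = sum(weights)
    if total == 0:
        # No information on any candidate: fall back to a uniform choice.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def roulette_select(probabilities, rng):
    """Pick a candidate index by walking the cumulative distribution."""
    threshold = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if threshold < cumulative:
            return index
    return len(probabilities) - 1  # guard against floating-point rounding


# Hypothetical example: three candidate edges with made-up pheromone/heuristic values.
probs = transition_probabilities([1.0, 2.0, 1.0], [1.0, 1.0, 2.0])
choice = roulette_select(probs, random.Random(0))
```

In a Parallel Ants style of design, many ants would presumably run this selection step concurrently against a shared pheromone table, which is where the memory-access pattern mentioned in the abstract becomes relevant.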

Algorithms for Tensor Network Contraction Ordering

Frank Schindler , Adam S. Jermyn

Comments: 10 pages, 10 figures

Subjects

:

Artificial Intelligence (cs.AI)

; Numerical Analysis (math.NA); Computational Physics (physics.comp-ph); Quantum Physics (quant-ph)

Contracting tensor networks is often computationally demanding. Well-designed

contraction sequences can dramatically reduce the contraction cost. We explore

the performance of simulated annealing and genetic algorithms, two common

discrete optimization techniques, on this ordering problem. We benchmark their

performance as well as that of the commonly-used greedy search on physically

relevant tensor networks. Where computationally feasible, we also compare them

with the optimal contraction sequence obtained by an exhaustive search. We find

that the algorithms we consider consistently outperform a greedy search given

equal computational resources, with an advantage that scales with tensor

network size. We compare the obtained contraction sequences and identify signs

of highly non-local optimization, with the more sophisticated algorithms

sacrificing run-time early in the contraction for better overall performance.
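
To make the ordering problem concrete, the sketch below uses a matrix chain, arguably the simplest tensor network, and compares a greedy ordering with an exhaustive search. It is an illustrative toy under stated assumptions, not the authors' benchmark code: the cost model (scalar multiplications), the function names, and the example dimensions are all hypothetical.

```python
from itertools import permutations


def step_cost(dims, left, right):
    """Multiplications needed to contract two adjacent tensors over their shared bond."""
    return dims[left[0]] * dims[left[1]] * dims[right[1]]


def contraction_cost(dims, order):
    """Total cost of contracting a matrix chain; matrix i has shape dims[i] x dims[i+1].

    `order` lists the internal bond indices (1 .. len(dims)-2) in contraction order.
    """
    tensors = [(i, i + 1) for i in range(len(dims) - 1)]
    total = 0
    for bond in order:
        k = next(j for j, t in enumerate(tensors[:-1]) if t[1] == bond)
        total += step_cost(dims, tensors[k], tensors[k + 1])
        tensors[k:k + 2] = [(tensors[k][0], tensors[k + 1][1])]
    return total


def greedy_order(dims):
    """Always contract the currently cheapest adjacent pair."""
    tensors = [(i, i + 1) for i in range(len(dims) - 1)]
    order = []
    while len(tensors) > 1:
        k = min(range(len(tensors) - 1),
                key=lambda j: step_cost(dims, tensors[j], tensors[j + 1]))
        order.append(tensors[k][1])
        tensors[k:k + 2] = [(tensors[k][0], tensors[k + 1][1])]
    return order


def optimal_order(dims):
    """Exhaustive search over all bond orderings (feasible only for tiny chains)."""
    bonds = range(1, len(dims) - 1)
    return min(permutations(bonds), key=lambda o: contraction_cost(dims, o))


dims = [10, 1, 100, 11]  # hypothetical shapes: 10x1, 1x100, 100x11
greedy_total = contraction_cost(dims, greedy_order(dims))
optimal_total = contraction_cost(dims, optimal_order(dims))
```

For these dimensions the greedy choice contracts the cheaper first bond and then pays heavily for the second, so the locally cheapest step does not give the cheapest overall sequence. This is the kind of non-local trade-off the abstract describes.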

A Neural Architecture for Person Ontology population

Balaji Ganesan , Riddhiman Dasgupta , Akshay Parekh , Hima Patel , Berthold Reinwald

Comments: 6 pages, 10 figures. arXiv admin note: substantial text overlap with arXiv:1811.09368

Subjects

:

Artificial Intelligence (cs.AI)

A person ontology comprising concepts, attributes and relationships of people

has a number of applications in data protection, de-identification, population of

knowledge graphs for business intelligence and fraud prevention. While

artificial neural networks have led to improvements in Entity Recognition,

Entity Classification, and Relation Extraction, creating an ontology largely

remains a manual process, because it requires a fixed set of semantic relations

between concepts. In this work, we present a system for automatically

populating a person ontology graph from unstructured data using neural models

for Entity Classification and Relation Extraction. We introduce a new dataset

for these tasks and discuss our results.

Benchmarking Symbolic Execution Using Constraint Problems — Initial Results

Sahil Verma , Roland H.C. Yap

Journal-ref: ICTAI 2019

Subjects

:

Artificial Intelligence (cs.AI)

; Software Engineering (cs.SE)

Symbolic execution is a powerful technique for bug finding and program

testing. It is successful in finding bugs in real-world code. The core

reasoning techniques use constraint solving, path exploration, and search,

which are also the same techniques used in solving combinatorial problems,

e.g., finite-domain constraint satisfaction problems (CSPs). We propose CSP

instances as more challenging benchmarks to evaluate the effectiveness of the

core techniques in symbolic execution. We transform CSP benchmarks into C

programs suitable for testing the reasoning capabilities of symbolic execution

tools. From a single CSP P, we transform P depending on transformation choice

into different C programs. Preliminary testing with the KLEE, Tracer-X, and

LLBMC tools shows substantial runtime differences from transformation and solver

choice. Our C benchmarks are effective in showing the limitations of existing

symbolic execution tools. The motivation for this work is our belief that

benchmarks of this form can spur the development and engineering of improved

core reasoning in symbolic execution engines.

An Approach for Time-aware Domain-based Social Influence Prediction

Bilal Abu-Salih , Kit Yan Chan , Omar Al-Kadi , Marwan Al-Tawil , Pornpit Wongthongtham , Tomayess Issa , Heba Saadeh , Malak Al-Hassan , Bushra Bremie , Abdulaziz Albahlal Subjects : Artificial Intelligence (cs.AI)

Online Social Networks (OSNs) have established virtual platforms enabling

people to express their opinions, interests and thoughts in a variety of

contexts and domains, allowing legitimate users as well as spammers and other

untrustworthy users to publish and spread their content. Hence, the concept of

social trust has attracted the attention of information processors/data

scientists and information consumers/business firms. One of the main reasons

for acquiring the value of Social Big Data (SBD) is to provide frameworks and

methodologies with which the credibility of OSN users can be evaluated. These

approaches should be scalable to accommodate large-scale social data. Hence,

there is a need for a thorough understanding of social trust to improve and expand

the analysis process and to infer the credibility of SBD. Given the exposed

environment’s settings and fewer limitations related to OSNs, the medium allows

legitimate and genuine users as well as spammers and other less trustworthy

users to publish and spread their content. Hence, this paper presents an

approach that incorporates semantic analysis and machine learning modules to measure

and predict users’ trustworthiness in numerous domains in different time

periods. The evaluation of the conducted experiment validates the applicability

of the incorporated machine learning techniques to predict highly trustworthy

domain-based users.

A Journey into Ontology Approximation: From Non-Horn to Horn

Anneke Haga , Carsten Lutz , Johannes Marti , Frank Wolter

Comments: 20 pages, 4 figures, submitted to ijcai2020

Subjects

:

Artificial Intelligence (cs.AI)

We study complete approximations of an ontology formulated in a non-Horn

description logic (DL) such as \(\mathcal{ALC}\) in a Horn DL such

as \(\mathcal{EL}\). We provide concrete approximation schemes that are

necessarily infinite and observe that in the \(\mathcal{ELU}\)-to-\(\mathcal{EL}\)

case finite approximations tend to exist in practice and are guaranteed to

exist when the original ontology is acyclic. In contrast, neither of these is

the case for \(\mathcal{ELU}_\bot\)-to-\(\mathcal{EL}_\bot\) and for

\(\mathcal{ALC}\)-to-\(\mathcal{EL}_\bot\) approximations. We also define a notion

of approximation tailored towards ontology-mediated querying, connect it to

subsumption-based approximations, and identify a case where finite

approximations are guaranteed to exist.

Emergence of Pragmatics from Referential Game between Theory of Mind Agents

Luyao Yuan , Zipeng Fu , Jingyue Shen , Lu Xu , Junhong Shen , Song-Chun Zhu Subjects : Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

Pragmatics studies how context can contribute to language meanings [1]. In

human communication, language is never interpreted out of context, and

sentences can usually convey more information than their literal meanings [2].

However, this mechanism is missing in most multi-agent systems [3, 4, 5, 6],

restricting the communication efficiency and the capability of human-agent

interaction. In this paper, we propose an algorithm, using which agents can

spontaneously learn the ability to “read between the lines” without any explicit

hand-designed rules. We integrate the theory of mind (ToM) [7, 8] in a

cooperative multi-agent pedagogical situation and propose an adaptive

reinforcement learning (RL) algorithm to develop a communication protocol. ToM

is a profound cognitive science concept, claiming that people regularly reason

about others’ mental states, including beliefs, goals, and intentions, to

obtain performance advantage in competition, cooperation or coalition. With

this ability, agents consider language as not only messages but also rational

acts reflecting others’ hidden states. Our experiments demonstrate the

advantage of pragmatic protocols over non-pragmatic protocols. We also show that the

teaching complexity following the pragmatic protocol empirically approximates

the recursive teaching dimension (RTD).

Adaptive Large Neighborhood Search for Circle Bin Packing Problem

Kun He , Kevin Tole , Fei Ni , Yong Yuan , Linyun Liao

Comments: 13 pages, 6 figures, 6 tables

Subjects

:

Artificial Intelligence (cs.AI)

; Distributed, Parallel, and Cluster Computing (cs.DC)

We address a new variant of the packing problem called the circle bin packing

problem (CBPP), which is to find a dense packing of circle items into multiple

square bins so as to minimize the number of used bins. To this end, we propose

an adaptive large neighborhood search (ALNS) algorithm, which uses our Greedy

Algorithm with Corner Occupying Action (GACOA) to construct an initial layout.

The greedy solution is usually in a local optimum trap, and ALNS enables

multiple neighborhood search that depends on the stochastic annealing schedule

to avoid getting stuck in local minimum traps. Specifically, ALNS perturbs the

current layout to jump out of a local optimum by iteratively reassigning some

circles and accepting the new layout with some probability during the search. The

acceptance probability is adjusted adaptively using simulated annealing that

fine-tunes the search direction in order to reach the global optimum. We

benchmark computational results against GACOA on heterogeneous instances. ALNS

always outperforms GACOA in improving the objective function, and in several

cases, there is a significant reduction in the number of bins used in the

packing.
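
The acceptance step described above can be sketched with the standard Metropolis rule from simulated annealing. This is a minimal sketch assuming a cost to be minimized and a strictly positive temperature. The abstract does not give the paper's exact acceptance formula or cooling schedule, so the function names and the rule itself are illustrative assumptions.

```python
import math
import random


def acceptance_probability(current_cost, candidate_cost, temperature):
    """Standard Metropolis rule: always accept improvements, sometimes accept worse layouts."""
    if candidate_cost <= current_cost:
        return 1.0
    # Worse candidates are accepted with a probability that shrinks as the
    # cost gap grows or as the temperature falls (temperature must be > 0).
    return math.exp(-(candidate_cost - current_cost) / temperature)


def accept(current_cost, candidate_cost, temperature, rng):
    """Draw a random number and decide whether to move to the candidate layout."""
    return rng.random() < acceptance_probability(current_cost, candidate_cost, temperature)


# Hypothetical usage: a perturbed layout that is one unit worse than the current one.
rng = random.Random(0)
moved_when_hot = accept(4.0, 5.0, temperature=10.0, rng=rng)
moved_when_cold = accept(4.0, 5.0, temperature=0.1, rng=rng)
```

Lowering the temperature over the course of the search makes the procedure increasingly reluctant to leave a good layout, while early high temperatures let it escape the local optimum produced by the greedy construction.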

Complexity, Stability Properties Of Mixed Games and Dynamic Algorithms, And Learning In The Sharing Economy

Michael C. Nwogugu Subjects : Theoretical Economics (econ.TH) ; Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Dynamical Systems (math.DS)

The Sharing Economy (which includes Airbnb, Apple, Alibaba, Uber, WeWork,

Ebay, Didi Chuxing, Amazon) blossomed across the world, triggered structural

changes in industries and significantly affected international capital flows

primarily by disobeying a wide variety of statutes and laws in many countries.

They also illegally reduced and changed the nature of competition in many

industries often to the detriment of social welfare. This article develops new

dynamic pricing models for the SEOs and derives some stability properties of

mixed games and dynamic algorithms which eliminate antitrust liability and also

reduce deadweight losses, greed, Regret and GPS manipulation. The new dynamic

pricing models contravene the Myerson Satterthwaite Impossibility Theorem.

Automatic phantom test pattern classification through transfer learning with deep neural networks

Rafael B. Fricks , Justin Solomon , Ehsan Samei Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Medical Physics (physics.med-ph)

Imaging phantoms are test patterns used to measure image quality in computed

tomography (CT) systems. A new phantom platform (Mercury Phantom, Gammex)

provides test patterns for estimating the task transfer function (TTF) or noise

power spectrum (NPF) and simulates different patient sizes. Determining which

image slices are suitable for analysis currently requires manual annotation of

these patterns by an expert, as subtle defects may make an image unsuitable for

measurement. We propose a method of automatically classifying these test

patterns in a series of phantom images using deep learning techniques. By

adapting a convolutional neural network based on the VGG19 architecture with

weights trained on ImageNet, we use transfer learning to produce a classifier

for this domain. The classifier is trained and evaluated with over 3,500

phantom images acquired at a university medical center. Input channels for

color images are successfully adapted to convey contextual information for

phantom images. A series of ablation studies are employed to verify design

aspects of the classifier and evaluate its performance under varying training

conditions. Our solution makes extensive use of image augmentation to produce a

classifier that accurately classifies typical phantom images with 98% accuracy,

while maintaining as much as 86% accuracy when the phantom is improperly

imaged.

A utility-based analysis of equilibria in multi-objective normal form games

Roxana Rădulescu , Patrick Mannion , Yijie Zhang , Diederik M. Roijers , Ann Nowé

Comments: Under review since 16 January 2020

Subjects

:

Computer Science and Game Theory (cs.GT)

; Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

In multi-objective multi-agent systems (MOMAS), agents explicitly consider

the possible tradeoffs between conflicting objective functions. We argue that

compromises between competing objectives in MOMAS should be analysed on the

basis of the utility that these compromises have for the users of a system,

where an agent’s utility function maps their payoff vectors to scalar utility

values. This utility-based approach naturally leads to two different

optimisation criteria for agents in a MOMAS: expected scalarised returns (ESR)

and scalarised expected returns (SER). In this article, we explore the

differences between these two criteria using the framework of multi-objective

normal form games (MONFGs). We demonstrate that the choice of optimisation

criterion (ESR or SER) can radically alter the set of equilibria in a MONFG

when non-linear utility functions are used.
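
The difference between the two criteria can be seen in a small worked example. The sketch below assumes a two-objective payoff lottery and a product utility chosen purely for illustration. ESR applies the utility to each outcome and then takes the expectation, whereas SER takes the expectation of the payoff vector first and then applies the utility. All names and numbers are hypothetical.

```python
def expected_payoff(lottery):
    """Component-wise expectation of a list of (probability, payoff_vector) pairs."""
    dims = len(lottery[0][1])
    return tuple(sum(p * v[i] for p, v in lottery) for i in range(dims))


def esr(lottery, utility):
    """Expected scalarised returns: E[u(payoff)]."""
    return sum(p * utility(v) for p, v in lottery)


def ser(lottery, utility):
    """Scalarised expected returns: u(E[payoff])."""
    return utility(expected_payoff(lottery))


def product_utility(v):
    """A non-linear utility that rewards balance between the two objectives."""
    return v[0] * v[1]


def linear_utility(v):
    """A linear utility, under which ESR and SER agree."""
    return v[0] + v[1]


# Hypothetical lottery: all payoff on objective 1 or all on objective 2, equally likely.
lottery = [(0.5, (4.0, 0.0)), (0.5, (0.0, 4.0))]

esr_nonlinear = esr(lottery, product_utility)  # every single outcome has utility 0
ser_nonlinear = ser(lottery, product_utility)  # the average payoff (2, 2) has utility 4
```

Under the linear utility both criteria give the same value for this lottery, which is consistent with the abstract's point that the choice of criterion matters when non-linear utility functions are used.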

Causality based Feature Fusion for Brain NeuroDevelopmental Analysis

Peyman Hosseinzadeh Kassani , Li Xiao , Gemeng Zhang , Julia M. Stephen , Tony W. Wilson , Vince D. Calhoun , Yu Ping Wang

Comments: 10 pages

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI)

Human brain development is a complex and dynamic process that is affected by

several factors such as genetics, sex hormones, and environmental changes. A

number of recent studies on brain development have examined functional

connectivity (FC) defined by the temporal correlation between time series of

different brain regions. We propose to add the directional flow of information

during brain maturation. To do so, we extract effective connectivity (EC)

through Granger causality (GC) for two different groups of subjects, i.e.,

children and young adults. The motivation is that the inclusion of causal

interaction may further discriminate brain connections between two age groups

and help to discover new connections between brain regions. The contributions

of this study are threefold. First, there has been a lack of attention to

EC-based feature extraction in the context of brain development. To this end,

we propose a new kernel-based GC (KGC) method to learn nonlinearity of complex

brain network, where a reduced Sine hyperbolic polynomial (RSP) neural network

was used as our proposed learner. Second, we used causality values as the

weight for the directional connectivity between brain regions. Our findings

indicated that the strength of connections was significantly higher in young

adults relative to children. In addition, our new EC-based feature outperformed

FC-based analysis from Philadelphia neurocohort (PNC) study with better

discrimination of the different age groups. Moreover, the fusion of these two

sets of features (FC + EC) improved brain age prediction accuracy by more than

4%, indicating that they should be used together for brain development studies.

Q-Learning in enormous action spaces via amortized approximate maximization

Tom Van de Wiele , David Warde-Farley , Andriy Mnih , Volodymyr Mnih

Comments: A previous version of this work appeared at the Deep Reinforcement Learning Workshop, NeurIPS 2018

Subjects

:

Machine Learning (cs.LG)

; Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Applying Q-learning to high-dimensional or continuous action spaces can be

difficult due to the required maximization over the set of possible actions.

Motivated by techniques from amortized inference, we replace the expensive

maximization over all actions with a maximization over a small subset of

possible actions sampled from a learned proposal distribution. The resulting

approach, which we dub Amortized Q-learning (AQL), is able to handle discrete,

continuous, or hybrid action spaces while maintaining the benefits of

Q-learning. Our experiments on continuous control tasks with up to 21

dimensional actions show that AQL outperforms D3PG (Barth-Maron et al, 2018)

and QT-Opt (Kalashnikov et al, 2018). Experiments on structured discrete action

spaces demonstrate that AQL can efficiently learn good policies in spaces with

thousands of discrete actions.
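
The core substitution described above, replacing an exact maximization with a maximization over sampled candidates, can be sketched as follows. This is a minimal sketch and not the AQL implementation: training of the proposal distribution is omitted, and `q_value`, `propose`, and the toy functions in the usage lines are hypothetical stand-ins.

```python
import random


def sampled_argmax(q_value, state, propose, num_samples, rng):
    """Approximate argmax_a Q(state, a) using candidates drawn from a proposal.

    q_value(state, action) -> float  : stand-in for a learned Q-function
    propose(state, rng) -> action    : stand-in for a learned proposal distribution
    """
    candidates = [propose(state, rng) for _ in range(num_samples)]
    best = max(candidates, key=lambda action: q_value(state, action))
    return best, candidates


# Toy usage with a one-dimensional continuous action (illustrative only).
def toy_q(state, action):
    return -(action - 0.3) ** 2  # peak at action = 0.3


def toy_proposal(state, rng):
    return rng.gauss(0.25, 0.1)  # proposal concentrated near the peak


best_action, sampled = sampled_argmax(toy_q, None, toy_proposal, 16, random.Random(0))
```

The quality of the approximation depends on how much probability mass the proposal places near high-value actions, which is presumably why the proposal is learned rather than fixed.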

Secure and Robust Machine Learning for Healthcare: A Survey

Adnan Qayyum , Junaid Qadir , Muhammad Bilal , Ala Al-Fuqaha Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV); Machine Learning (stat.ML)

Recent years have witnessed widespread adoption of machine learning (ML)/deep

learning (DL) techniques due to their superior performance for a variety of

healthcare applications ranging from the prediction of cardiac arrest from

one-dimensional heart signals to computer-aided diagnosis (CADx) using

multi-dimensional medical images. Notwithstanding the impressive performance of

ML/DL, there are still lingering doubts regarding the robustness of ML/DL in

healthcare settings (which is traditionally considered quite challenging due to

the myriad security and privacy issues involved), especially in light of recent

results that have shown that ML/DL are vulnerable to adversarial attacks. In

this paper, we present an overview of various application areas in healthcare

that leverage such techniques from a security and privacy point of view and

present associated challenges. In addition, we present potential methods to

ensure secure and privacy-preserving ML for healthcare applications. Finally,

we provide insight into the current research challenges and promising

directions for future research.

ManyModalQA: Modality Disambiguation and QA over Diverse Inputs

Darryl Hannan , Akshay Jain , Mohit Bansal

Comments: AAAI 2020 (10 pages)

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

We present a new multimodal question answering challenge, ManyModalQA, in

which an agent must answer a question by considering three distinct modalities:

text, images, and tables. We collect our data by scraping Wikipedia and then

utilize crowdsourcing to collect question-answer pairs. Our questions are

ambiguous, in that the modality that contains the answer is not easily

determined based solely upon the question. To demonstrate this ambiguity, we

construct a modality selector (or disambiguator) network, and this model gets

substantially lower accuracy on our challenge set, compared to existing

datasets, indicating that our questions are more ambiguous. By analyzing this

model, we investigate which words in the question are indicative of the

modality. Next, we construct a simple baseline ManyModalQA model, which, based

on the prediction from the modality selector, fires a corresponding pre-trained

state-of-the-art unimodal QA model. We focus on providing the community with a

new manymodal evaluation set and only provide a fine-tuning set, with the

expectation that existing datasets and approaches will be transferred for most

of the training, to encourage low-resource generalization without large,

monolithic training sets for each new task. There is a significant gap between

our baseline models and human performance; therefore, we hope that this

challenge encourages research in end-to-end modality disambiguation and

multimodal QA models, as well as transfer learning. Code and data available at:

this https URL

Subjective Knowledge and Reasoning about Agents in Multi-Agent Systems

Shikha Singh , Deepak Khemani Subjects : Multiagent Systems (cs.MA) ; Artificial Intelligence (cs.AI)

Though a lot of work in multi-agent systems is focused on reasoning about

knowledge and beliefs of artificial agents, an explicit representation and

reasoning about the presence/absence of agents, especially in the scenarios

where agents may be unaware of other agents joining in or going offline in a

multi-agent system, leading to partial knowledge/asymmetric knowledge of the

agents is mostly overlooked by the MAS community. Such scenarios lay the

foundations of cases where an agent can influence other agents’ mental states

by (mis)informing them about the presence/absence of collaborators or

adversaries. In this paper, we investigate how Kripke structure-based epistemic

models can be extended to express the above notion based on an agent’s

subjective knowledge and we discuss the challenges that come along.

ARAACOM: ARAbic Algerian Corpus for Opinion Mining

Zitouni Abdelhafid (LIRE), Hichem Rahab (ICOSI, LIRE), Abdelhafid Zitouni (LIRE), Mahieddine Djoudi (TECHNÉ – EA 6316)

Journal-ref: ICCES ’17: Proceedings of the International Conference on

Computing for Engineering and Sciences, Jul 2017, Istanbul, France. pp.35-39

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

Nowadays, it is no longer necessary to make an enormous effort to distribute a lot

of forms to thousands of people, collect them, and then convert these forms into

electronic format to track people’s opinions about some subjects. A lot of web

sites can today reach a large audience with less effort. The majority of web

sites invite their visitors to leave feedback about their impressions of the

site or events. This produces a lot of data, which needs powerful means to

exploit. Opinion mining on the web is becoming an increasingly attractive task,

due to the increasing need for individuals and societies to track the mood of

people regarding several subjects of daily life (sports, politics,

television, …). A lot of work in opinion mining has been developed for western

languages, especially English, while such work for the Arabic language is still very scarce.

In this paper, we propose our approach for opinion mining in Arabic Algerian

newspapers. CCS CONCEPTS • Information systems~Sentiment analysis

• Computing methodologies~Natural language processing

On Solving Cooperative MARL Problems with a Few Good Experiences

Rajiv Ranjan Kumar , Pradeep Varakantham Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Cooperative Multi-agent Reinforcement Learning (MARL) is crucial for

cooperative decentralized decision learning in many domains such as search and

rescue, drone surveillance, package delivery and fire fighting problems. In

these domains, a key challenge is learning with a few good experiences, i.e.,

positive reinforcements are obtained only in a few situations (e.g., on

extinguishing a fire or tracking a crime or delivering a package) and in most

other situations there is zero or negative reinforcement. Learning decisions

with a few good experiences is extremely challenging in cooperative MARL

problems due to three reasons. First, compared to the single agent case,

exploration is harder as multiple agents have to be coordinated to receive a

good experience. Second, the environment is not stationary as all the agents are

learning at the same time (and hence change policies). Third, the scale of the problem

increases significantly with every additional agent.

Relevant existing work is extensive and has focussed on dealing with a few

good experiences in single-agent RL problems or on scalable approaches for

handling non-stationarity in MARL problems. Unfortunately, neither of these

approaches (or their extensions) is able to address the problem of sparse good

experiences effectively. Therefore, we provide a novel fictitious self

imitation approach that is able to simultaneously handle non-stationarity and

sparse good experiences in a scalable manner. Finally, we provide a thorough

comparison (experimental or descriptive) against relevant cooperative MARL

algorithms to demonstrate the utility of our approach.

Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification

Jiawei Zhang

Comments: 7 pages, 6 figures

Subjects

:

Machine Learning (cs.LG)

; Artificial Intelligence (cs.AI); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Existing graph neural networks may suffer from the “suspended animation

problem” when the model architecture goes deep. Meanwhile, for some graph

learning scenarios, e.g., nodes with text/image attributes or graphs with

long-distance node correlations, deep graph neural networks will be necessary

for effective graph representation learning. In this paper, we propose a new

graph neural network, namely DIFNET (Graph Diffusive Neural Network), for graph

representation learning and node classification. DIFNET utilizes both neural

gates and graph residual learning for node hidden state modeling, and includes

an attention mechanism for node neighborhood information diffusion. Extensive

experiments are conducted in this paper to compare DIFNET against several

state-of-the-art graph neural network models. The experimental results

illustrate both the learning performance advantages and effectiveness of

DIFNET, especially in addressing the “suspended animation problem”.

Convergence Time Optimization for Federated Learning over Wireless Networks

Mingzhe Chen , H. Vincent Poor , Walid Saad , Shuguang Cui Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI); Machine Learning (stat.ML)

In this paper, the convergence time of federated learning (FL), when deployed

over a realistic wireless network, is studied. In particular, a wireless

network is considered in which wireless users transmit their local FL models

(trained using their locally collected data) to a base station (BS). The BS,

acting as a central controller, generates a global FL model using the received

local FL models and broadcasts it back to all users. Due to the limited number

of resource blocks (RBs) in a wireless network, only a subset of users can be

selected to transmit their local FL model parameters to the BS at each learning

step. Moreover, since each user has unique training data samples, the BS

prefers to include all local user FL models to generate a converged global FL

model. Hence, the FL performance and convergence time will be significantly

affected by the user selection scheme. Therefore, it is necessary to design an

appropriate user selection scheme that enables users of higher importance to be

selected more frequently. This joint learning, wireless resource allocation,

and user selection problem is formulated as an optimization problem whose goal

is to minimize the FL convergence time while optimizing the FL performance. To

solve this problem, a probabilistic user selection scheme is proposed such that

the BS is connected to the users whose local FL models have significant effects

on its global FL model with high probabilities. Given the user selection

policy, the uplink RB allocation can be determined. To further reduce the FL

convergence time, artificial neural networks (ANNs) are used to estimate the

local FL models of the users that are not allocated any RBs for local FL model

transmission at each given learning step, which enables the BS to enhance its

global FL model and improve the FL convergence speed and performance.

Coarse-Grain Cluster Analysis of Tensors With Application to Climate Biome Identification

Derek DeSantis , Phillip J. Wolfram , Katrina Bennett , Boian Alexandrov Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (stat.ML)

A tensor provides a concise way to codify the interdependence of complex

data. Treating a tensor as a d-way array, each entry records the interaction

between the different indices. Clustering provides a way to parse the

complexity of the data into more readily understandable information. Clustering

methods are heavily dependent on the algorithm of choice, as well as the chosen

hyperparameters of the algorithm. However, their sensitivity to data scales is

largely unknown.

In this work, we apply the discrete wavelet transform to analyze the effects

of coarse-graining on clustering tensor data. We are particularly interested in

understanding how scale affects clustering of the Earth’s climate system. The

discrete wavelet transform allows classification of the Earth’s climate across

a multitude of spatial-temporal scales. The discrete wavelet transform is used

to produce an ensemble of classification estimates, as opposed to a single

classification. Using information theory, we discover a sub-collection of the

ensemble that spans the majority of the variance observed, allowing for

efficient consensus clustering techniques that can be used to identify climate

biomes.

Elephant in the Room: An Evaluation Framework for Assessing Adversarial Examples in NLP

Ying Xu , Xu Zhong , Antonio Jose Jimeno Yepes , Jey Han Lau Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI)

An adversarial example is an input transformed by small perturbations that

machine learning models consistently misclassify. While there are a number of

methods proposed to generate adversarial examples for text data, it is not

trivial to assess the quality of these adversarial examples, as minor

perturbations (such as changing a word in a sentence) can lead to a significant

shift in their meaning, readability and classification label. In this paper, we

propose an evaluation framework to assess the quality of adversarial examples

based on the aforementioned properties. We experiment with five benchmark

attacking methods and an alternative approach based on an auto-encoder, and

find that these methods generate adversarial examples with poor readability

and content preservation. We also learn that there are multiple factors that

can influence the attacking performance, such as the length of text

examples and the input domain.

When does the Tukey median work?

Banghua Zhu, Jiantao Jiao, Jacob Steinhardt

Subjects: Statistics Theory (math.ST); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)

We analyze the performance of the Tukey median estimator under total

variation (TV) distance corruptions. Previous results show that under Huber’s

additive corruption model, the breakdown point is 1/3 for high-dimensional

halfspace-symmetric distributions. We show that under TV corruptions, the

breakdown point reduces to 1/4 for the same set of distributions. We also show

that a certain projection algorithm can attain the optimal breakdown point of

1/2. Both the Tukey median estimator and the projection algorithm achieve

sample complexity linear in dimension.
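
For readers unfamiliar with the estimator, the sketch below approximates the Tukey (halfspace) depth of a 2-D point and picks the deepest sample point as a stand-in for the Tukey median. It is a minimal illustration under stated assumptions: the direction grid, the restriction to sample points as candidates, and the toy data are all hypothetical choices, not the procedure analysed in the paper.

```python
import math

def tukey_depth(point, sample, n_dirs=360):
    """Approximate halfspace depth of a 2-D point.

    The depth is the smallest fraction of sample points lying in any closed
    halfplane whose boundary passes through `point`; here the minimum is
    taken over a finite grid of directions, so it is only an approximation.
    """
    best = len(sample)
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        ux, uy = math.cos(theta), math.sin(theta)
        count = sum(
            1 for (x, y) in sample
            if ux * (x - point[0]) + uy * (y - point[1]) >= 0.0
        )
        best = min(best, count)
    return best / len(sample)

def tukey_median(sample, n_dirs=360):
    """Deepest sample point (a discrete stand-in for the Tukey median)."""
    return max(sample, key=lambda p: tukey_depth(p, sample, n_dirs))

# Hypothetical data: five clean points plus one gross outlier.
clean = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0), (0.0, 0.0)]
corrupted = clean + [(50.0, 50.0)]
center = tukey_median(corrupted)
```

In this toy case the deepest point remains the origin despite the outlier, which is the robustness property that breakdown-point results quantify.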

Adaptive Loss Function for Super Resolution Neural Networks Using Convex Optimization Techniques

Seyed Mehdi Ayyoubzadeh, Xiaolin Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

The Single Image Super-Resolution (SISR) task is to learn a mapping from

low-resolution images to the corresponding high-resolution ones. This task is

known to be extremely difficult since it is an ill-posed problem. Recently,

Convolutional Neural Networks (CNNs) have achieved state of the art performance

on SISR. However, the images produced by CNNs do not contain fine details of

the images. Generative Adversarial Networks (GANs) aim to solve this issue and

recover sharp details. Nevertheless, GANs are notoriously difficult to train.

Besides that, they generate artifacts in the high-resolution images. In this

paper, we have proposed a method in which CNNs try to align images in different

spaces rather than only the pixel space. Such a space is designed using convex

optimization techniques. CNNs are encouraged to learn high-frequency components

of the images as well as low-frequency components. We have shown that the

proposed method can recover fine details of the images and it is stable in the

training process.

Block-wise Scrambled Image Recognition Using Adaptation Network

Koki Madono, Masayuki Tanaka, Masaki Onishi, Tetsuji Ogawa

Comments: 6 pages, Artificial Intelligence of Things (AAAI-2020 WS)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

In this study, a perceptually hidden object-recognition method is

investigated to generate secure images recognizable by humans but not machines.

Hence, both the perceptual information hiding and the corresponding object

recognition methods should be developed. Block-wise image scrambling is

introduced to hide perceptual information from a third party. In addition, an

adaptation network is proposed to recognize those scrambled images.

Experimental comparisons conducted using CIFAR datasets demonstrated that the

proposed adaptation network performed well in incorporating simple perceptual

information hiding into DNN-based image classification.

EMOPAIN Challenge 2020: Multimodal Pain Evaluation from Facial and Bodily Expressions

Nadia Berthouze, Michel Valstar, Amanda Williams, Joy Egede, Temitayo Olugbade, Chongyang Wang, Hongyin Meng, Min Aung, Nicholas Lane, Siyang Song

Comments: 8 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

The EmoPain 2020 Challenge is the first international competition aimed at

creating a uniform platform for the comparison of machine learning and

multimedia processing methods of automatic chronic pain assessment from human

expressive behaviour, and also the identification of pain-related behaviours.

The objective of the challenge is to promote research in the development of

assistive technologies that help improve the quality of life for people with

chronic pain via real-time monitoring and feedback to help manage their

condition and remain physically active. The challenge also aims to encourage

the use of the relatively underutilised, albeit vital, bodily expression signals

for automatic pain and pain-related emotion recognition. This paper presents a

description of the challenge, competition guidelines, bench-marking dataset,

and the baseline systems’ architecture and performance on the three sub-tasks:

pain estimation from facial expressions, pain recognition from multimodal

movement, and protective movement behaviour detection.

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Fu-Ming Guo, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Comments: arXiv admin note: text overlap with arXiv:1909.05073

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Weight pruning has been widely acknowledged as a straightforward and

effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby

achieving acceleration on various platforms. However, most of the pruning

techniques are essentially trade-offs between model accuracy and regularity

which lead to impaired inference accuracy and limited on-device acceleration

performance. To solve the problem, we introduce a new sparsity dimension,

namely pattern-based sparsity, which comprises pattern and connectivity sparsity

and is both highly accurate and hardware friendly. With carefully

designed patterns, the proposed pruning unprecedentedly and consistently

achieves accuracy enhancement and better feature extraction ability on

different DNN structures and datasets, and our pattern-aware pruning framework

also achieves pattern library extraction, pattern selection, pattern and

connectivity pruning and weight training simultaneously. Our approach on the

new pattern-based sparsity naturally fits into compiler optimization for highly

efficient DNN execution on mobile platforms. To the best of our knowledge, it

is the first time that mobile devices achieve real-time inference for the

large-scale DNN models thanks to the unique spatial property of pattern-based

sparsity and the help of the code generation capability of compilers.

Information Retrieval

Experiments on Manual Thesaurus based Query Expansion for Ad-hoc Monolingual Gujarati Information Retrieval Tasks

Hardik Joshi, Jyoti Pareek

Comments: arXiv admin note: substantial text overlap with arXiv:1209.0126

Subjects: Information Retrieval (cs.IR)

In this paper, we present the experimental work done on Query Expansion (QE)

for retrieval tasks of Gujarati text documents. In information retrieval, it is

very difficult to estimate the exact user need; query expansion adds terms to

the original query that provide more information about the user need. There

are various approaches to query expansion. In our work, manual thesaurus based

query expansion was performed to evaluate the performance of widely used

information retrieval models for Gujarati text documents. Results show that

query expansion improves the recall of text documents.
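
The idea of thesaurus-based query expansion can be sketched in a few lines. The function name, the whitespace tokenisation, and the toy English thesaurus below are assumptions for illustration only; the paper works with a manually built Gujarati thesaurus and standard retrieval models that are not reproduced here.

```python
def expand_query(query, thesaurus):
    """Append thesaurus entries for each query term.

    Terms keep their original order, each term is followed by its related
    terms, and repeats are dropped.
    """
    expanded = []
    seen = set()
    for term in query.lower().split():
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded

# Hypothetical toy thesaurus mapping a term to related terms.
toy_thesaurus = {"car": ["automobile", "vehicle"], "fast": ["quick"]}
expanded = expand_query("fast car", toy_thesaurus)
print(expanded)  # ['fast', 'quick', 'car', 'automobile', 'vehicle']
```

Because the expanded query matches documents that use any of the related terms, recall tends to rise, which is consistent with the result reported in the abstract.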

Emotion and Sentiment Lexicon Impact on Sentiment Analysis Applied to Book Reviews

Patrice Bellot (R2I, LIS), Lerch Soëlie (R2I, DIAMS), Bruno Emmanuel (DIAMS), Murisasco Elisabeth (DIAMS)

Comments: in French

Journal-ref: COnférence en Recherche d’Informations et Applications – CORIA 2019, 16th French Information Retrieval Conference, Mar 2019, Lyon, France

Subjects: Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

Consumers are used to consulting reviews posted on the Internet before buying

a product, but it is difficult to gauge the overall opinion given the large

number of such reviews. Sentiment analysis makes it possible to detect the

polarity (positive, negative, neutral) of an expressed opinion and thereby to

classify those reviews. Our purpose is to determine the influence of emotions

on the polarity of book reviews. We define “bag-of-words” representation models

of reviews which use a lexicon containing emotional (anticipation, sadness,

fear, anger, joy, surprise, trust, disgust) and sentimental (positive, negative)

words. This lexicon makes it possible to measure the types of emotion felt by

readers. The supervised learning method used is a Random Forest. The

application concerns reviews from the Amazon platform. Keywords: sentiment

analysis, emotion analysis (text), sentiment polarity classification

ARAACOM: ARAbic Algerian Corpus for Opinion Mining

Zitouni Abdelhafid (LIRE), Hichem Rahab (ICOSI, LIRE), Abdelhafid Zitouni (LIRE), Mahieddine Djoudi (TECHNÉ – EA 6316)

Journal-ref: ICCES ’17: Proceedings of the International Conference on Computing for Engineering and Sciences, Jul 2017, Istanbul, Turkey. pp.35-39

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

Nowadays, it is no longer necessary to make an enormous effort to distribute

forms to thousands of people, collect them, and then convert them into

electronic format in order to track people's opinions on a subject. Many web

sites can today reach a large audience with less effort. Most web sites invite

their visitors to leave feedback about their impressions of the site or of

events. This produces a lot of data, which needs powerful means to exploit.

Opinion mining on the web is becoming an increasingly attractive task, due to

the growing need of individuals and societies to track the mood of people on

various subjects of daily life (sports, politics, television, ...). A lot of

work in opinion mining has been done for western languages, especially English,

while such work for the Arabic language is still very scarce. In this paper, we

propose our approach for opinion mining in Arabic Algerian newspapers.

CCS Concepts: • Information systems~Sentiment analysis • Computing

methodologies~Natural language processing

Graph Generators: State of the Art and Open Challenges

Angela Bonifati, Irena Holubová, Arnau Prat-Pérez, Sherif Sakr

Comments: ACM Computing Surveys, 32 pages

Subjects: Databases (cs.DB); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

The abundance of interconnected data has fueled the design and implementation

of graph generators reproducing real-world linking properties, or gauging the

effectiveness of graph algorithms, techniques and applications manipulating

these data. We consider graph generation across multiple subfields, such as

Semantic Web, graph databases, social networks, and community detection, along

with general graphs. Despite the disparate requirements of modern graph

generators throughout these communities, we analyze them under a common

umbrella, covering their functionalities, practical usage, and supported

operations. We argue that this classification serves the need of

providing scientists, researchers and practitioners with the right data

generator at hand for their work. This survey provides a comprehensive overview

of the state-of-the-art graph generators by focusing on those that are

pertinent and suitable for several data-intensive tasks. Finally, we discuss

open challenges and missing requirements of current graph generators along with

their future extensions to new emerging fields.

VoiceCoach: Interactive Evidence-based Training for Voice Modulation Skills in Public Speaking

Xingbo Wang, Haipeng Zeng, Yong Wang, Aoyu Wu, Zhida Sun, Xiaojuan Ma, Huamin Qu

Comments: Accepted by CHI ’20

Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Information Retrieval (cs.IR)

The modulation of voice properties, such as pitch, volume, and speed, is

crucial for delivering a successful public speech. However, it is challenging

to master different voice modulation skills. Though many guidelines are

available, they are often not practical enough to be applied in different

public speaking situations, especially for novice speakers. We present

VoiceCoach, an interactive evidence-based approach to facilitate the effective

training of voice modulation skills. Specifically, we have analyzed the voice

modulation skills from 2623 high-quality speeches (i.e., TED Talks) and use

them as the benchmark dataset. Given a voice input, VoiceCoach automatically

recommends good voice modulation examples from the dataset based on the

similarity of both sentence structures and voice modulation skills. Immediate

and quantitative visual feedback is provided to guide further improvement. The

expert interviews and the user study provide support for the effectiveness and

usability of VoiceCoach.

Keyword-based Topic Modeling and Keyword Selection

Xingyu Wang, Lida Zhang, Diego Klabjan

Subjects: Machine Learning (stat.ML); Information Retrieval (cs.IR); Machine Learning (cs.LG)

Certain types of documents such as tweets are collected by specifying a set of

keywords. As topics of interest change with time it is beneficial to adjust

keywords dynamically. The challenge is that these need to be specified ahead of

knowing the forthcoming documents and the underlying topics. The future topics

should mimic past topics of interest yet there should be some novelty in them.

We develop a keyword-based topic model that dynamically selects a subset of

keywords to be used to collect future documents. The generative process first

selects keywords and then the underlying documents based on the specified

keywords. The model is trained by using a variational lower bound and

stochastic gradient optimization. The inference consists of finding a subset of

keywords where given a subset the model predicts the underlying topic-word

matrix for the unknown forthcoming documents. We compare the keyword topic

model against a benchmark model using viral predictions of tweets combined with

a topic model. The keyword-based topic model outperforms this sophisticated

baseline model by 67%.

Optimal estimation of sparse topic models

Xin Bing, Florentina Bunea, Marten Wegkamp

Subjects: Machine Learning (stat.ML); Information Retrieval (cs.IR); Machine Learning (cs.LG)

Topic models have become popular tools for dimension reduction and

exploratory analysis of text data, which consists of observed frequencies of a

vocabulary of $p$ words in $n$ documents, stored in a $p \times n$ matrix. The

main premise is that the mean of this data matrix can be factorized into a

product of two non-negative matrices: a $p \times K$ word-topic matrix $A$ and a

$K \times n$ topic-document matrix $W$. This paper studies the estimation of $A$

when it is possibly element-wise sparse and the number of topics $K$ is unknown.

In this under-explored context, we derive a new minimax lower bound for the

estimation of such $A$ and propose a new computationally efficient algorithm

for its recovery. We derive a finite sample upper bound for our estimator, and

show that it matches the minimax lower bound in many scenarios. Our estimate

adapts to the unknown sparsity of $A$ and our analysis is valid for any finite

$n$, $p$, $K$ and document lengths. Empirical results on both synthetic data

and semi-synthetic data show that our proposed estimator is a strong competitor

of the existing state-of-the-art algorithms for both non-sparse $A$ and sparse

$A$, and has superior performance in many scenarios of interest.

Incentivising Exploration and Recommendations for Contextual Bandits with Payments

Priyank Agrawal, Theja Tulabandhula

Comments: 11 pages, 4 figures

Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR); Machine Learning (stat.ML)

We propose a contextual bandit based model to capture the learning and social

welfare goals of a web platform in the presence of myopic users. By using

payments to incentivize these agents to explore different

items/recommendations, we show how the platform can learn the inherent

attributes of items and achieve a sublinear regret while maximizing cumulative

social welfare. We also calculate theoretical bounds on the cumulative costs of

incentivization to the platform. Unlike previous works in this domain, we

consider contexts to be completely adversarial, and the behavior of the

adversary is unknown to the platform. Our approach can improve various

engagement metrics of users on e-commerce stores, recommendation engines and

matching platforms.

A Price-Per-Attention Auction Scheme Using Mouse Cursor Information

Ioannis Arapakis, Antonio Penta, Hideo Joho, Luis A. Leiva

Journal-ref: ACM Trans. Inf. Syst. 38, 2 (2020)

Subjects: Computer Science and Game Theory (cs.GT); Information Retrieval (cs.IR)

Payments in online ad auctions are typically derived from click-through

rates, so that advertisers do not pay for ineffective ads. But advertisers

often care about more than just clicks, for example when they aim to raise

brand awareness or visibility. There is thus an opportunity to devise a more

effective ad pricing paradigm, in which ads are paid for only if they are

actually noticed. This article contributes a novel auction format based on a

pay-per-attention (PPA) scheme. We show that the PPA auction inherits the same

desirable properties (strategy-proofness and efficiency) as its

pay-per-impression and pay-per-click counterparts, and that it also compares

favourably in terms of revenues. To make the PPA format feasible, we also

contribute a scalable diagnostic technology to predict user attention to ads in

sponsored search using raw mouse cursor coordinates only, regardless of the

page content and structure. We use the user attention predictions in numerical

simulations to evaluate the PPA auction scheme. Our results show that, in

relevant economic settings, the PPA revenues would be strictly higher than

those of the existing auction payment schemes.

Computation and Language

Multilingual Denoising Pre-training for Neural Machine Translation

Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

Comments: Work in progress

Subjects: Computation and Language (cs.CL)

This paper demonstrates that multilingual denoising pre-training produces

significant performance gains across a wide variety of machine translation (MT)

tasks. We present mBART — a sequence-to-sequence denoising auto-encoder

pre-trained on large-scale monolingual corpora in many languages using the BART

objective. mBART is the first method for pre-training a complete

sequence-to-sequence model by denoising full texts in multiple languages;

previous MT pre-training has focused only on the encoder, decoder, or

reconstructing parts of the text. Pre-training a complete model allows it to be

directly fine tuned for supervised (both sentence-level and document-level) and

unsupervised machine translation, with no task-specific modifications. We

demonstrate that adding mBART initialization produces performance gains in all

but the highest-resource settings, including up to 12 BLEU points for low

resource MT and over 5 BLEU points for many document-level and unsupervised

models. We also show that it enables new types of transfer to language pairs

with no bi-text or that were not in the pre-training corpus, and present

extensive analysis of which factors contribute the most to effective

pre-training.

Unsupervised Domain Adaptation for Neural Machine Translation with Iterative Back Translation

Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits

Comments: Submitted to IJCAI 2020

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

State-of-the-art neural machine translation (NMT) systems are data-hungry and

perform poorly on domains with little supervised data. As data collection is

expensive and infeasible in many cases, unsupervised domain adaptation methods

are needed. We apply an Iterative Back Translation (IBT) training scheme on

in-domain monolingual data, which repeatedly uses a Transformer-based NMT model

to create in-domain pseudo-parallel sentence pairs in one translation direction

on the fly and then uses them to train the model in the other direction.

Evaluated on three domains of the German-to-English translation task with no

supervised data, this simple technique alone (without any out-of-domain

parallel data) can already surpass all previous domain adaptation methods, by up

to +9.48 BLEU over the strongest previous method, and up to +27.77 BLEU over

the unadapted baseline. Moreover, given available supervised out-of-domain data

on German-to-English and Romanian-to-English language pairs, we can further

enhance the performance and obtain up to +19.31 BLEU improvement over the

strongest baseline, and +47.69 BLEU increment against the unadapted model.
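
The control flow of iterative back translation can be sketched as a loop in which each direction is trained on pseudo-parallel pairs produced by the other. Everything concrete below is an assumption for illustration: `ToyModel` is a hypothetical word-lookup stand-in for the Transformer-based NMT model, and the sentences and `rounds` parameter are invented; only the alternation between directions reflects the scheme described in the abstract.

```python
class ToyModel:
    """Hypothetical word-by-word translator standing in for an NMT model."""

    def __init__(self, table):
        self.table = dict(table)

    def translate(self, sentence):
        # Unknown words are copied through unchanged.
        return " ".join(self.table.get(w, w) for w in sentence.split())

    def train(self, pairs):
        # Learn word mappings from equal-length (source, target) pairs,
        # keeping any mapping that is already known.
        for src, tgt in pairs:
            s, t = src.split(), tgt.split()
            if len(s) == len(t):
                for a, b in zip(s, t):
                    self.table.setdefault(a, b)

def iterative_back_translation(src_mono, tgt_mono, fwd, bwd, rounds=2):
    """Alternate directions: each model trains on the other's translations."""
    for _ in range(rounds):
        # Back-translate real target sentences, then train source -> target.
        fwd.train([(bwd.translate(t), t) for t in tgt_mono])
        # Translate real source sentences, then train target -> source.
        bwd.train([(fwd.translate(s), s) for s in src_mono])
    return fwd, bwd

# Hypothetical in-domain monolingual data and weak initial models.
de_to_en = ToyModel({"hallo": "hello"})
en_to_de = ToyModel({"world": "welt"})
iterative_back_translation(["hallo welt"], ["hello world"], de_to_en, en_to_de)
```

In this toy run each model starts out knowing one word and ends up translating the full sentence, because the pseudo-parallel pairs from one direction supply the missing mapping to the other.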

Contextualized Embeddings in Named-Entity Recognition: An Empirical Study on Generalization

Bruno Taillé, Vincent Guigue, Patrick Gallinari

Journal-ref: ECIR 2020

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Contextualized embeddings use unsupervised language model pretraining to

compute word representations depending on their context. This is intuitively

useful for generalization, especially in Named-Entity Recognition where it is

crucial to detect mentions never seen during training. However, standard

English benchmarks overestimate the importance of lexical over contextual

features because of an unrealistic lexical overlap between train and test

mentions. In this paper, we perform an empirical analysis of the generalization

capabilities of state-of-the-art contextualized embeddings by separating

mentions by novelty and with out-of-domain evaluation. We show that they are

particularly beneficial for detecting unseen mentions, especially

out-of-domain. For models trained on CoNLL03, language model contextualization

leads to a +1.2% maximal relative micro-F1 score increase in-domain against

+13% out-of-domain on the WNUT dataset.

TLT-school: a Corpus of Non Native Children Speech

Roberto Gretter, Marco Matassoni, Stefano Bannò, Daniele Falavigna

Subjects: Computation and Language (cs.CL)

This paper describes “TLT-school”, a corpus of speech utterances collected in

schools of northern Italy for assessing the performance of students learning

both English and German. The corpus was recorded in the years 2017 and 2018

from students aged between nine and sixteen years, attending primary, middle

and high school. All utterances have been scored, in terms of some predefined

proficiency indicators, by human experts. In addition, most of the utterances

recorded in 2017 have been carefully transcribed by hand. Guidelines and

procedures used for manual transcriptions of utterances will be described in

detail, as well as results achieved by means of an automatic speech recognition

system developed by us. Part of the corpus is going to be freely distributed to

the scientific community, in particular to those interested in both non-native

speech recognition and automatic assessment of second language proficiency.

ManyModalQA: Modality Disambiguation and QA over Diverse Inputs

Darryl Hannan, Akshay Jain, Mohit Bansal

Comments: AAAI 2020 (10 pages)

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

We present a new multimodal question answering challenge, ManyModalQA, in

which an agent must answer a question by considering three distinct modalities:

text, images, and tables. We collect our data by scraping Wikipedia and then

utilize crowdsourcing to collect question-answer pairs. Our questions are

ambiguous, in that the modality that contains the answer is not easily

determined based solely upon the question. To demonstrate this ambiguity, we

construct a modality selector (or disambiguator) network, and this model gets

substantially lower accuracy on our challenge set, compared to existing

datasets, indicating that our questions are more ambiguous. By analyzing this

model, we investigate which words in the question are indicative of the

modality. Next, we construct a simple baseline ManyModalQA model, which, based

on the prediction from the modality selector, fires a corresponding pre-trained

state-of-the-art unimodal QA model. We focus on providing the community with a

new manymodal evaluation set and only provide a fine-tuning set, with the

expectation that existing datasets and approaches will be transferred for most

of the training, to encourage low-resource generalization without large,

monolithic training sets for each new task. There is a significant gap between

our baseline models and human performance; therefore, we hope that this

challenge encourages research in end-to-end modality disambiguation and

multimodal QA models, as well as transfer learning. Code and data available at:

this https URL

Normalization of Input-output Shared Embeddings in Text Generation Models

Jinyang Liu, Yujia Zhai, Zizhong Chen

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Neural network based models have been the state of the art for various

Natural Language Processing tasks. However, the problem of input and output

dimensionality in these networks has still not been fully resolved, especially

in text generation tasks (e.g. Machine Translation, Text Summarization), in

which input and output both have very large vocabularies. Input-output

embedding weight sharing has therefore been introduced and widely adopted, but

it leaves room for improvement. Based on linear algebra and statistical

theories, this paper locates the shortcoming of the existing input-output

embedding weight sharing method and then proposes methods for improving it,

among which normalization of the embedding weight matrices performs best. These

methods add almost no computational cost, can be combined with other embedding

techniques, and are effective when applied to state-of-the-art neural network

models. For Transformer-big models, the normalization techniques yield at best

a 0.6 BLEU improvement over the original model on the WMT’16 En-De dataset, and

similar BLEU improvements on the IWSLT’14 datasets. For DynamicConv models, a

0.5 BLEU improvement is attained on the WMT’16 En-De dataset, and a 0.41 BLEU

improvement on the IWSLT’14 De-En translation task.

Shared Task: Lexical Semantic Change Detection in German

Adnan Ahmad, Kiflom Desta, Fabian Lang, Dominik Schlechtweg

Subjects: Computation and Language (cs.CL)

Recent NLP architectures have illustrated in various ways how semantic change

can be captured across time and domains. However, in terms of evaluation there

is a lack of benchmarks to compare the performance of these systems against

each other. We present the results of the first shared task on unsupervised

lexical semantic change detection (LSCD) in German based on the evaluation

framework proposed by Schlechtweg et al. (2019).

Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods

Maria Ryskina, Ella Rabinovich, Taylor Berg-Kirkpatrick, David R. Mortensen, Yulia Tsvetkov

Comments: SCiL 2020

Journal-ref: Proceedings of the Society for Computation in Linguistics 3.1 (2020): 43-52

Subjects: Computation and Language (cs.CL)

We perform statistical analysis of the phenomenon of neology, the process by

which new words emerge in a language, using large diachronic corpora of

English. We investigate the importance of two factors, semantic sparsity and

frequency growth rates of semantic neighbors, formalized in the distributional

semantics paradigm. We show that both factors are predictive of word emergence

although we find more support for the latter hypothesis. Besides presenting a

new linguistic application of distributional semantics, this study tackles the

linguistic question of the role of language-internal factors (in our case,

sparsity) in language change motivated by language-external factors (reflected

in frequency growth).

VoiceCoach: Interactive Evidence-based Training for Voice Modulation Skills in Public Speaking

Xingbo Wang , Haipeng Zeng , Yong Wang , Aoyu Wu , Zhida Sun , Xiaojuan Ma , Huamin Qu

Comments: Accepted by CHI ’20

Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Information Retrieval (cs.IR)

The modulation of voice properties, such as pitch, volume, and speed, is

crucial for delivering a successful public speech. However, it is challenging

to master different voice modulation skills. Though many guidelines are

available, they are often not practical enough to be applied in different

public speaking situations, especially for novice speakers. We present

VoiceCoach, an interactive evidence-based approach to facilitate the effective

training of voice modulation skills. Specifically, we have analyzed the voice

modulation skills from 2623 high-quality speeches (i.e., TED Talks) and use

them as the benchmark dataset. Given a voice input, VoiceCoach automatically

recommends good voice modulation examples from the dataset based on the

similarity of both sentence structures and voice modulation skills. Immediate

and quantitative visual feedback is provided to guide further improvement. The

expert interviews and the user study provide support for the effectiveness and

usability of VoiceCoach.

Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion

Wen-Chin Huang , Hao Luo , Hsin-Te Hwang , Chen-Chou Lo , Yu-Huai Peng , Yu Tsao , Hsin-Min Wang

Comments: Accepted to IEEE Transactions on Emerging Topics in Computational Intelligence

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)

An effective approach for voice conversion (VC) is to disentangle linguistic

content from other components in the speech signal. The effectiveness of

variational autoencoder (VAE) based VC (VAE-VC), for instance, strongly relies

on this principle. In our prior work, we proposed a cross-domain VAE-VC

(CDVAE-VC) framework, which utilized acoustic features of different properties,

to improve the performance of VAE-VC. We believed that the success came from

more disentangled latent representations. In this paper, we extend the CDVAE-VC

framework by incorporating the concept of adversarial learning, in order to

further increase the degree of disentanglement, thereby improving the quality

and similarity of converted speech. More specifically, we first investigate the

effectiveness of incorporating the generative adversarial networks (GANs) with

CDVAE-VC. Then, we consider the concept of domain adversarial training and add

an explicit constraint to the latent representation, realized by a speaker

classifier, to explicitly eliminate the speaker information that resides in the

latent code. Experimental results confirm that the degree of disentanglement of

the learned latent representation can be enhanced by both GANs and the speaker

classifier. Meanwhile, subjective evaluation results in terms of quality and

similarity scores demonstrate the effectiveness of our proposed methods.

Emergence of Pragmatics from Referential Game between Theory of Mind Agents

Luyao Yuan , Zipeng Fu , Jingyue Shen , Lu Xu , Junhong Shen , Song-Chun Zhu Subjects : Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

Pragmatics studies how context can contribute to language meanings [1]. In

human communication, language is never interpreted out of context, and

sentences can usually convey more information than their literal meanings [2].

However, this mechanism is missing in most multi-agent systems [3, 4, 5, 6],

restricting the communication efficiency and the capability of human-agent

interaction. In this paper, we propose an algorithm, using which agents can

spontaneously learn the ability to “read between the lines” without any explicit

hand-designed rules. We integrate the theory of mind (ToM) [7, 8] in a

cooperative multi-agent pedagogical situation and propose an adaptive

reinforcement learning (RL) algorithm to develop a communication protocol. ToM

is a profound cognitive science concept, claiming that people regularly reason

about others’ mental states, including beliefs, goals, and intentions, to

obtain a performance advantage in competition, cooperation or coalition. With

this ability, agents consider language as not only messages but also rational

acts reflecting others’ hidden states. Our experiments demonstrate the

advantage of pragmatic protocols over non-pragmatic protocols. We also show that the

teaching complexity following the pragmatic protocol empirically approximates

the recursive teaching dimension (RTD).

Distributed, Parallel, and Cluster Computing

Tuneful: An Online Significance-Aware Configuration Tuner for Big Data Analytics

Ayat Fekry , Lucian Carata , Thomas Pasquier , Andrew Rice , Andy Hopper Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Systems and Control (eess.SY)

Distributed analytics engines such as Spark are a common choice for

processing extremely large datasets. However, finding good configurations for

these systems remains challenging, with each workload potentially requiring a

different setup to run optimally. Using suboptimal configurations incurs

significant extra runtime costs. Furthermore, Spark and similar platforms are

gaining traction within data science communities where awareness of such

issues is relatively low.

We propose Tuneful, an approach that efficiently tunes the configuration of

in-memory cluster computing systems. Tuneful combines incremental Sensitivity

Analysis and Bayesian optimization to identify near-optimal configurations from

a high-dimensional search space, using a small number of executions. This setup

allows the tuning to be done online, without any previous training. Our

experimental results show that Tuneful reduces the search time for finding

close-to-optimal configurations by 62% (at the median) when compared to

existing state-of-the-art techniques. This means that the amortization of the

tuning cost happens significantly faster, enabling practical tuning for new

classes of workloads.

A Simple and Efficient Binary Byzantine Consensus Algorithm using Cryptography and Partial Synchrony

Tyler Crain Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

This paper describes a simple and efficient Binary Byzantine fault-tolerant

consensus algorithm using a weak round coordinator and the partial synchrony

assumption to ensure liveness. In the algorithm, non-faulty nodes perform an

initial broadcast and then execute a series of rounds, each consisting of a

single message broadcast until termination. Each message is accompanied by a

cryptographic proof of its validity. In odd rounds the binary value 1 can be

decided, in even rounds the value 0. Up to one third of the nodes can be faulty and

termination is ensured within a number of rounds that is a constant factor of the

number of faults. Experiments show termination can be reached in less than 200

milliseconds with 300 Amazon EC2 instances spread across 5 continents even with

partial initial disagreement.
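The round structure described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only encodes the two facts stated in the abstract, under the assumptions that rounds are numbered from 1 and that "up to one third" means the usual strict bound n > 3f. The names `max_faulty`, `quorum_size` and `decidable_value` are hypothetical.

```python
def max_faulty(n: int) -> int:
    """Largest f such that 3f < n (assumed bound: fewer than one third faulty)."""
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Number of messages a node can wait for without depending on faulty nodes."""
    return n - max_faulty(n)


def decidable_value(round_number: int) -> int:
    """Binary value that may be decided in a round (assumed 1-based numbering):
    odd rounds can decide 1, even rounds can decide 0."""
    return 1 if round_number % 2 == 1 else 0


if __name__ == "__main__":
    n = 300  # number of EC2 instances mentioned in the abstract
    print("tolerated faults:", max_faulty(n))
    print("quorum size:", quorum_size(n))
    print("values by round:", [decidable_value(r) for r in range(1, 5)])
```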

Fine-grained Analysis on Fast Implementations of Multi-writer Atomic Registers

Kaile Huang , Yu Huang , Hengfeng Wei

Comments: v0.1, only contains the impossibility proof

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

This draft in its current version proves an impossibility result concerning

fast implementations of multi-writer distributed atomic registers. This is the

first step of our work toward completing the exploration of fast

implementations of distributed atomic registers. The plan of our work is

outlined in Section 1. The missing sections will be provided soon.

Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided

Robert Gerstenberger , Maciej Besta , Torsten Hoefler

Comments: 12 pages, 8 figures; Best Student Paper finalist (8/92) and winner of the SC’13 Best Paper Award (1/92); source code of foMPI can be downloaded from this http URL

Journal-ref: Proceedings of the International Conference on High Performance

Computing, Networking, Storage and Analysis, pages 53:1–53:12, November 2013

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)

Modern interconnects offer remote direct memory access (RDMA) features. Yet,

most applications rely on explicit message passing for communication despite

its unwanted overheads. The MPI-3.0 standard defines a programming interface

for exploiting RDMA networks directly; however, its scalability and

practicability have to be demonstrated in practice. In this work, we develop

scalable bufferless protocols that implement the MPI-3.0 specification. Our

protocols support scaling to millions of cores with negligible memory

consumption while providing highest performance and minimal overheads. To arm

programmers, we provide a spectrum of performance models for all critical

functions and demonstrate the usability of our library and models with several

application studies with up to half a million processes. We show that our

design is comparable to, or better than UPC and Fortran Coarrays in terms of

latency, bandwidth, and message rate. We also demonstrate application

performance improvements with comparable programming complexity.

Properties of the Tangle for Uniform Random and Random Walk Tip Selection

Bartosz Kusmierz , William Sanders , Andreas Penzkofer , Angelo Capossele , Alon Gal

Comments: Published in: 2019 IEEE International Conference on Blockchain (Blockchain)

Journal-ref: 2019 IEEE International Conference on Blockchain (Blockchain)

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

The growing number of applications for distributed ledger technologies is

driving both industry and academia to solve the limitations of blockchain,

particularly its scalability issues. Recent distributed ledger technologies

have replaced the blockchain linear structure with a more flexible directed

acyclic graph in an attempt to accommodate a higher throughput. Despite the

fast-growing diffusion of directed acyclic graph based distributed ledger

technologies, researchers lack a basic understanding of their behavior. In this

paper we analyze the Tangle, a directed acyclic graph that is used (with

certain modifications) in various protocols such as IOTA, Byteball, Avalanche

or SPECTRE. Our contribution is threefold. First, we run simulations in a

continuous-time model to examine tip count stability and cumulative weight

evolution while varying the rate of incoming transactions. In particular we

confirm analytical predictions on the number of tips with uniform random tip

selection strategy. Second, we show how different tip selection algorithms

affect the growth of the Tangle. Moreover, we explain these differences by

analyzing the spread of exit probabilities of random walks. Our findings

confirm analytically derived predictions and provide novel insights on the

different phases of growth of cumulative weight as well as on the average time

difference for a transaction to receive its first approval when using distinct

tip selection algorithms. Lastly, we analyze simulation overhead and

performance as a function of Tangle size and compare results for different tip

selection algorithms.
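A continuous-time simulation of uniform random tip selection, as described above, can be sketched in a few lines of Python. This is an illustrative toy and not the authors' simulator: it assumes Poisson arrivals with rate `rate`, a fixed visibility delay `delay`, and two approvals per transaction chosen with replacement. The function name and parameters are hypothetical. In the standard analysis of this model the stationary tip count is about 2 * rate * delay, so the average returned here should land in that neighbourhood.

```python
import random


def average_tip_count(rate, delay, n_tx, warmup, seed=0):
    """Average number of tips seen by incoming transactions.

    A transaction issued at time t only sees the Tangle as it was at
    time t - delay, and approves two tips chosen uniformly at random.
    """
    rng = random.Random(seed)
    arrival = [float("-inf")]        # index 0 is the genesis, always visible
    first_approval = [float("inf")]  # arrival time of the earliest approver
    now = 0.0
    samples = []
    for k in range(n_tx):
        now += rng.expovariate(rate)  # Poisson arrivals
        cutoff = now - delay
        # Tips in the delayed view: visible, and no visible approver yet.
        tips = [i for i in range(len(arrival))
                if arrival[i] <= cutoff and first_approval[i] > cutoff]
        if k >= warmup:
            samples.append(len(tips))
        for _ in range(2):
            chosen = rng.choice(tips)
            first_approval[chosen] = min(first_approval[chosen], now)
        arrival.append(now)
        first_approval.append(float("inf"))
    return sum(samples) / len(samples)


if __name__ == "__main__":
    print(average_tip_count(rate=10.0, delay=1.0, n_tx=2000, warmup=500))
```

The scan over all transactions keeps the sketch short at the cost of quadratic running time; a real simulator would maintain the tip set incrementally.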

Towards Digital Twins for the Description of Automotive Software Systems

Jan Olaf Blech (Aalto University)

Comments: In Proceedings QAPL 2019, arXiv:2001.06163

Journal-ref: EPTCS 312, 2020, pp. 20-28

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Software Engineering (cs.SE)

We present models for automotive software that capture quantitative and

qualitative aspects of software systems and the underlying hardware

architecture. In particular, we consider different levels of computing power.

These range from controllers up to the cloud. We present a modeling approach

for software deployment taking different automotive requirements such as

criticality, latency, memory, computational resources, and communication into

account. Our models capture automotive software and hardware system

configurations and can serve as digital twins that are digital counterparts of

(usually) physical entities. Furthermore, we highlight connected research areas

and challenges.

Asynchronous Consensus Algorithm

Maxim Zakharov Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

This document describes a new consensus algorithm which is asynchronous and

uses gossip based message dissemination between nodes. The current version of

the algorithm does not cover the case of a node failure or significantly

delayed response. This is the subject of further research of the algorithm. An

outline of a new design for a trust-less payment system is given in the appendices.

Anchoring the value of Cryptocurrency

Yibin Xu , Yangyu Huang , Jianhua Shao

Journal-ref: 3rd International Workshop on Blockchain Oriented Software

Engineering. Western University. London, Canada, February 18, 2020

Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

A decade of thriving cryptocurrency has shown its potential as a source of

alternative finance, as well as the security and robustness of the underpinning

blockchain technology.

However, most cryptocurrencies fail to show inimitability and their meanings

in the real world. As a result, they usually start off as favourites but

quickly become the outcasts of the digital asset market.

The blockchain society attempts to anchor the value of cryptocurrency with

real values by employing smart contracts and link it with computation resources

and the digital-productivity that have value and demands in the real world. But

their attempts have some undesirable effects due to a limited number of

practical applications. This limitation is caused by the dilemma between high

performance and decentralisation (universal joinability). The emergence of

blockchain sharding models, however, has offered a possible solution to address

this dilemma.

In this paper, we explore a financial model for blockchain sharding that will

build an active link between the value of cryptocurrency and computation

resources as well as the market and labour behaviours. Our model can adjust the

price of resources and the compensation for maintaining a system based on those

behaviours. We anchor the value of cryptocurrency by the amount of computation

resources participated in and give the cryptocurrency a meaning as the exchange

between computation resources globally. Finally, we present a working example

which, through financial regularities, regulates the behaviour of anonymous

participants and dynamically incentivises or discourages participation.

Simple and Fast Distributed Computation of Betweenness Centrality

Pierluigi Crescenzi , Pierre Fraigniaud , Ami Paz Subjects : Social and Information Networks (cs.SI) ; Distributed, Parallel, and Cluster Computing (cs.DC)

Betweenness centrality is a graph parameter that has been successfully

applied to network analysis. In the context of computer networks, it was

considered for various objectives, ranging from routing to service placement.

However, as observed by Maccari et al. [INFOCOM 2018], research on betweenness

centrality for improving protocols was hampered by the lack of a usable, fully

distributed algorithm for computing this parameter. We resolve this issue by

designing an efficient algorithm for computing betweenness centrality, which

can be implemented by minimal modifications to any distance-vector routing

protocol based on Bellman-Ford. The convergence time of our implementation is

shown to be proportional to the diameter of the network.
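For readers unfamiliar with the parameter itself, the following Python sketch computes betweenness centrality on a small unweighted, undirected graph. It uses the classic centralized Brandes algorithm, not the distributed, Bellman-Ford-based algorithm the abstract describes; the adjacency-dictionary format and the function name are assumptions made for illustration.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to an iterable of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                       # nodes in non-decreasing distance
        preds = {v: [] for v in adj}     # shortest-path predecessors
        paths = {v: 0 for v in adj}      # number of shortest paths from source
        dist = {v: -1 for v in adj}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                     # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        while order:                     # accumulate in reverse BFS order
            w = order.pop()
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}


if __name__ == "__main__":
    path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(betweenness(path))
```

On the three-node path in the usage example, only the middle node lies on a shortest path between two other nodes.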

Accelerating supply chains with Ant Colony Optimization across range of hardware solutions

Ivars Dzalbs , Tatiana Kalganova Subjects : Artificial Intelligence (cs.AI) ; Distributed, Parallel, and Cluster Computing (cs.DC); Neural and Evolutionary Computing (cs.NE)

The Ant Colony algorithm has been applied to various optimization problems;

however, most of the previous work on scaling and parallelism focuses on

Travelling Salesman Problems (TSPs). Although useful for benchmarks and new

idea comparison, the algorithmic dynamics do not always transfer to complex

real-life problems, where additional meta-data is required during solution

construction. This paper looks at a real-life outbound supply chain problem using

Ant Colony Optimization (ACO) and its scaling dynamics with two parallel ACO

architectures – Independent Ant Colonies (IAC) and Parallel Ants (PA). Results

showed that PA was able to reach a higher solution quality in fewer iterations

as the number of parallel instances increased. Furthermore, speed performance

was measured across three different hardware solutions – 16 core CPU, 68 core

Xeon Phi and up to 4 GeForce GPUs. State-of-the-art ACO vectorization

techniques such as SS-Roulette were implemented using C++ and CUDA. Although

excellent for TSP, it was concluded that for the given supply chain problem

GPUs are not suitable due to the meta-data access footprint required. Furthermore,

compared to their sequential counterparts, the vectorized CPU AVX2 implementation

achieved 25.4x speedup on CPU while Xeon Phi with its AVX512 instruction set

reached 148x on PA with Vectorized (PAwV). PAwV is therefore able to scale at

least up to 1024 parallel instances on the supply chain network problem solved.

An authentication protocol based on chaos and zero knowledge proof

Will Major , William J Buchanan , Jawad Ahmad

Journal-ref: Major, W., Buchanan, W.J. & Ahmad, J. Nonlinear Dyn (2020).

https://doi.org/10.1007/s11071-020-05463-3

Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

Port Knocking is a method for authenticating clients through a closed stance

firewall, and authorising their requested actions, enabling servers to offer

services to authenticated clients, without opening ports on the firewall.

Advances in port knocking have resulted in an increase in complexity in design,

preventing port knocking solutions from realising their potential. This paper

proposes a novel port knocking solution, named Crucible, which is a secure

method of authentication, with high usability and features of stealth, allowing

servers and services to remain hidden and protected. Crucible is a stateless

solution, only requiring the client memorise a command, the server’s IP and a

chosen password. The solution is forwarded as a method for protecting servers

against attacks ranging from port scans, to zero-day exploitation. To act as a

random oracle for both client and server, cryptographic hashes were generated

through chaotic systems.

Adaptive Large Neighborhood Search for Circle Bin Packing Problem

Kun He , Kevin Tole , Fei Ni , Yong Yuan , Linyun Liao

Comments: 13 pages, 6 figures, 6 tables

Subjects: Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)

We address a new variant of the packing problem called the circle bin packing

problem (CBPP), which is to find a dense packing of circle items into multiple

square bins so as to minimize the number of used bins. To this end, we propose

an adaptive large neighborhood search (ALNS) algorithm, which uses our Greedy

Algorithm with Corner Occupying Action (GACOA) to construct an initial layout.

The greedy solution is usually in a local optimum trap, and ALNS enables

multiple neighborhood search that depends on the stochastic annealing schedule

to avoid getting stuck in local minimum traps. Specifically, ALNS perturbs the

current layout to jump out of a local optimum by iteratively reassigning some

circles and accepting the new layout with some probability during the search. The

acceptance probability is adjusted adaptively using simulated annealing that

fine-tunes the search direction in order to reach the global optimum. We

benchmark computational results against GACOA in heterogeneous instances. ALNS

always outperforms GACOA in improving the objective function, and in several

cases, there is a significant reduction in the number of bins used in the

packing.
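The probabilistic acceptance step mentioned above can be illustrated with the textbook simulated-annealing (Metropolis) rule. The Python sketch below is generic and does not reproduce the paper's objective function or annealing schedule: the scalar `cost`, the geometric cooling factor `alpha`, and the function names are all assumptions.

```python
import math
import random


def accept(new_cost, current_cost, temperature, rng):
    """Metropolis rule: always accept improvements; accept a worse layout
    with probability exp(-(new_cost - current_cost) / temperature)."""
    if new_cost <= current_cost:
        return True
    return rng.random() < math.exp(-(new_cost - current_cost) / temperature)


def cool(temperature, alpha=0.95):
    """Geometric cooling (assumed schedule): worse moves become rarer."""
    return temperature * alpha


if __name__ == "__main__":
    rng = random.Random(0)
    temperature = 1.0
    accepted = 0
    for _ in range(100):
        # Hypothetical perturbation that worsens the cost by 0.5.
        if accept(new_cost=10.5, current_cost=10.0,
                  temperature=temperature, rng=rng):
            accepted += 1
        temperature = cool(temperature)
    print("worse layouts accepted:", accepted)
```

Early in the search the temperature is high and worse layouts are accepted often, which lets the search leave a local optimum; as the temperature falls the search becomes nearly greedy.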

Learning

GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation

Nikhil Goyal , Harsh Vardhan Jain , Sayan Ranu

Comments: The Web Conference (WWW) 2020

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Graph generative models have been extensively studied in the data mining

literature. While traditional techniques are based on generating structures

that adhere to a pre-decided distribution, recent techniques have shifted

towards learning this distribution directly from the data. While learning-based

approaches have imparted significant improvement in quality, some limitations

remain to be addressed. First, learning graph distributions introduces

additional computational overhead, which limits their scalability to large

graph databases. Second, many techniques only learn the structure and do not

address the need to also learn node and edge labels, which encode important

semantic information and influence the structure itself. Third, existing

techniques often incorporate domain-specific rules and lack generalizability.

Fourth, the experimentation of existing techniques is not comprehensive enough

due to either using weak evaluation metrics or focusing primarily on synthetic

or small datasets. In this work, we develop a domain-agnostic technique called

GraphGen to overcome all of these limitations. GraphGen converts graphs to

sequences using minimum DFS codes. Minimum DFS codes are canonical labels and

capture the graph structure precisely along with the label information. The

complex joint distributions between structure and semantic labels are learned

through a novel LSTM architecture. Extensive experiments on million-sized, real

graph datasets show GraphGen to be 4 times faster on average than

state-of-the-art techniques while being significantly better in quality across

a comprehensive set of 11 different metrics. Our code is released at

this https URL .

Pruning CNN's with linear filter ensembles

Csanád Sándor , Szabolcs Pável , Lehel Csató

Comments: accepted to ECAI2020

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Despite the promising results of convolutional neural networks (CNNs),

applying them on resource limited devices is still a challenge, mainly due to

the huge memory and computation requirements. To tackle these problems, pruning

can be applied to reduce the network size and number of floating point

operations (FLOPs). Contrary to the filter norm method (which is used

in network pruning and assumes that the smaller this norm, the less

important the associated component), we develop a novel filter importance

norm that incorporates the loss caused by the elimination of a component from

the CNN.

To estimate the importance of a set of architectural components, we measure

the CNN performance as different components are removed. The result is a

collection of filter ensembles — filter masks — and associated performance

values. We rank the filters based on a linear and additive model and remove the

least important ones such that the drop in network accuracy is minimal. We

evaluate our method on a fully connected network, as well as on the ResNet

architecture trained on the CIFAR-10 data-set. Using our pruning method, we

managed to remove 60% of the parameters and 64% of the FLOPs from the

ResNet with an accuracy drop of less than 0.6%.

Q-Learning in enormous action spaces via amortized approximate maximization

Tom Van de Wiele , David Warde-Farley , Andriy Mnih , Volodymyr Mnih

Comments: A previous version of this work appeared at the Deep Reinforcement Learning Workshop, NeurIPS 2018

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Applying Q-learning to high-dimensional or continuous action spaces can be

difficult due to the required maximization over the set of possible actions.

Motivated by techniques from amortized inference, we replace the expensive

maximization over all actions with a maximization over a small subset of

possible actions sampled from a learned proposal distribution. The resulting

approach, which we dub Amortized Q-learning (AQL), is able to handle discrete,

continuous, or hybrid action spaces while maintaining the benefits of

Q-learning. Our experiments on continuous control tasks with up to 21

dimensional actions show that AQL outperforms D3PG (Barth-Maron et al, 2018)

and QT-Opt (Kalashnikov et al, 2018). Experiments on structured discrete action

spaces demonstrate that AQL can efficiently learn good policies in spaces with

thousands of discrete actions.
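The core idea, replacing an exact maximization over all actions with a maximization over a few sampled candidates, can be sketched as follows. This toy Python example is not the AQL implementation: the quadratic `q_value` and the fixed Gaussian `proposal` are hypothetical stand-ins for the learned Q-function and the learned proposal distribution.

```python
import random


def q_value(state, action):
    """Toy Q-function (assumption): best action equals the state."""
    return -(action - state) ** 2


def proposal(state, rng):
    """Stand-in for a learned proposal: a Gaussian centred on the state."""
    return rng.gauss(state, 0.5)


def approximate_argmax(state, n_samples, rng):
    """Approximate argmax_a Q(state, a) using only sampled candidates."""
    candidates = [proposal(state, rng) for _ in range(n_samples)]
    return max(candidates, key=lambda a: q_value(state, a))


if __name__ == "__main__":
    rng = random.Random(0)
    state = 0.3
    action = approximate_argmax(state, n_samples=64, rng=rng)
    print("chosen action:", action, "Q:", q_value(state, action))
```

The cost of the maximization is fixed by `n_samples` and does not depend on the size of the action space. The same pattern applies to discrete or hybrid actions as long as the proposal can sample them.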

Secure and Robust Machine Learning for Healthcare: A Survey

Adnan Qayyum , Junaid Qadir , Muhammad Bilal , Ala Al-Fuqaha Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV); Machine Learning (stat.ML)

Recent years have witnessed widespread adoption of machine learning (ML)/deep

learning (DL) techniques due to their superior performance for a variety of

healthcare applications ranging from the prediction of cardiac arrest from

one-dimensional heart signals to computer-aided diagnosis (CADx) using

multi-dimensional medical images. Notwithstanding the impressive performance of

ML/DL, there are still lingering doubts regarding the robustness of ML/DL in

healthcare settings (which is traditionally considered quite challenging due to

the myriad security and privacy issues involved), especially in light of recent

results that have shown that ML/DL are vulnerable to adversarial attacks. In

this paper, we present an overview of various application areas in healthcare

that leverage such techniques from a security and privacy point of view and

present associated challenges. In addition, we present potential methods to

ensure secure and privacy-preserving ML for healthcare applications. Finally,

we provide insight into the current research challenges and promising

directions for future research.

Local Policy Optimization for Trajectory-Centric Reinforcement Learning

Patrik Kolaric , Devesh K. Jha , Arvind U. Raghunathan , Frank L. Lewis , Mouhacine Benosman , Diego Romeres , Daniel Nikovski

Journal-ref: ICRA 2020

Subjects: Machine Learning (cs.LG); Robotics (cs.RO); Systems and Control (eess.SY); Machine Learning (stat.ML)

The goal of this paper is to present a method for simultaneous trajectory and

local stabilizing policy optimization to generate local policies for

trajectory-centric model-based reinforcement learning (MBRL). This is motivated

by the fact that global policy optimization for non-linear systems could be a

very challenging problem both algorithmically and numerically. However, a lot

of robotic manipulation tasks are trajectory-centric, and thus do not require a

global model or policy. Due to inaccuracies in the learned model estimates, an

open-loop trajectory optimization process mostly results in very poor

performance when used on the real system. Motivated by these problems, we try

to formulate the problem of trajectory optimization and local policy synthesis

as a single optimization problem. It is then solved simultaneously as an

instance of nonlinear programming. We provide some results for analysis as well

as achieved performance of the proposed technique under some simplifying

assumptions.

Optimal binning: mathematical programming formulation

Guillermo Navas-Palencia Subjects : Machine Learning (cs.LG) ; Optimization and Control (math.OC); Machine Learning (stat.ML)

The optimal binning is the optimal discretization of a variable into bins

given a discrete or continuous numeric target. We present a rigorous and

extensible mathematical programming formulation for solving the optimal binning

problem for a binary, continuous and multi-class target type, incorporating

constraints not previously addressed. For all three target types, we introduce

a convex mixed-integer programming formulation. Several algorithmic

enhancements such as automatic determination of the most suitable monotonic

trend via a Machine-Learning-based classifier and implementation aspects are

thoughtfully discussed. The new mathematical programming formulations are

carefully implemented in the open-source python library OptBinning.

Safety Concerns and Mitigation Approaches Regarding the Use of Deep Learning in Safety-Critical Perception Tasks

Oliver Willers , Sebastian Sudholt , Shervin Raafatnia , Stephanie Abrecht Subjects : Machine Learning (cs.LG) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Deep learning methods are widely regarded as indispensable when it comes to

designing perception pipelines for autonomous agents such as robots, drones or

automated vehicles. The main reasons, however, for deep learning not being used

for autonomous agents at large scale already are safety concerns. Deep learning

approaches typically exhibit a black-box behavior which makes it hard for them

to be evaluated with respect to safety-critical aspects. While there has been

some work on safety in deep learning, most papers typically focus on high-level

safety concerns. In this work, we seek to dive into the safety concerns of deep

learning methods and present a concise enumeration on a deeply technical level.

Additionally, we present extensive discussions on possible mitigation methods

and give an outlook regarding what mitigation methods are still missing in

order to facilitate an argumentation for the safety of a deep learning method.

On Solving Cooperative MARL Problems with a Few Good Experiences

Rajiv Ranjan Kumar , Pradeep Varakantham Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Cooperative Multi-agent Reinforcement Learning (MARL) is crucial for

cooperative decentralized decision learning in many domains such as search and

rescue, drone surveillance, package delivery and fire fighting problems. In

these domains, a key challenge is learning with a few good experiences, i.e.,

positive reinforcements are obtained only in a few situations (e.g., on

extinguishing a fire or tracking a crime or delivering a package) and in most

other situations there is zero or negative reinforcement. Learning decisions

with a few good experiences is extremely challenging in cooperative MARL

problems for three reasons. First, compared to the single agent case,

exploration is harder as multiple agents have to be coordinated to receive a

good experience. Second, the environment is not stationary as all the agents are

learning at the same time (and hence change policies). Third, the scale of the problem

increases significantly with every additional agent.

Relevant existing work is extensive and has focussed on dealing with a few

good experiences in single-agent RL problems or on scalable approaches for

handling non-stationarity in MARL problems. Unfortunately, neither of these

approaches (or their extensions) is able to address the problem of sparse good

experiences effectively. Therefore, we provide a novel fictitious self

imitation approach that is able to simultaneously handle non-stationarity and

sparse good experiences in a scalable manner. Finally, we provide a thorough

comparison (experimental or descriptive) against relevant cooperative MARL

algorithms to demonstrate the utility of our approach.

CodeReef: an open platform for portable MLOps, reusable automation actions and reproducible benchmarking

Grigori Fursin , Herve Guillou , Nicolas Essayan Subjects : Machine Learning (cs.LG) ; Software Engineering (cs.SE); Machine Learning (stat.ML)

We present CodeReef – an open platform to share all the components necessary

to enable cross-platform MLOps (MLSysOps), i.e. automating the deployment of ML

models across diverse systems in the most efficient way. We also introduce the

CodeReef solution – a way to package and share models as non-virtualized,

portable, customizable and reproducible archive files. Such ML packages include

JSON meta description of models with all dependencies, Python APIs, CLI actions

and portable workflows necessary to automatically build, benchmark, test and

customize models across diverse platforms, AI frameworks, libraries, compilers

and datasets. We demonstrate several CodeReef solutions to automatically build,

run and measure object detection based on SSD-Mobilenets, TensorFlow and COCO

dataset from the latest MLPerf inference benchmark across a wide range of

platforms from Raspberry Pi, Android phones and IoT devices to data centers.

Our long-term goal is to help researchers share their new techniques as

production-ready packages along with research papers to participate in

collaborative and reproducible benchmarking, compare the different

ML/software/hardware stacks and select the most efficient ones on a Pareto

frontier using online CodeReef dashboards.

Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification

Jiawei Zhang

Comments: 7 pages, 6 figures

Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Existing graph neural networks may suffer from the “suspended animation

problem” when the model architecture goes deep. Meanwhile, for some graph

learning scenarios, e.g., nodes with text/image attributes or graphs with

long-distance node correlations, deep graph neural networks will be necessary

for effective graph representation learning. In this paper, we propose a new

graph neural network, namely DIFNET (Graph Diffusive Neural Network), for graph

representation learning and node classification. DIFNET utilizes both neural

gates and graph residual learning for node hidden state modeling, and includes

an attention mechanism for node neighborhood information diffusion. Extensive

experiments will be done in this paper to compare DIFNET against several

state-of-the-art graph neural network models. The experimental results can

illustrate both the learning performance advantages and effectiveness of

DIFNET, especially in addressing the “suspended animation problem”.

From abstract items to latent spaces to observed data and back: Compositional Variational Auto-Encoder

Victor Berger (TAU), Michèle Sebag (LRI)

Journal-ref: ECMLPKDD 2019 : European Conference on Machine learning and

knowledge discovery in databases, Sep 2019, Würzburg, Germany

Subjects : Machine Learning (cs.LG)

Conditional Generative Models are now acknowledged as an essential tool in

Machine Learning. This paper focuses on their control. While many approaches

aim at disentangling the data through the coordinate-wise control of their

latent representations, another direction is explored in this paper. The

proposed CompVAE handles data with a natural multi-ensemblist structure (i.e.

that can naturally be decomposed into elements). Derived from Bayesian

variational principles, CompVAE learns a latent representation leveraging both

observational and symbolic information. A first contribution of the approach is

that this latent representation supports a compositional generative model,

amenable to multi-ensemblist operations (addition or subtraction of elements in

the composition). This compositional ability is enabled by the invariance and

generality of the whole framework w.r.t. respectively, the order and number of

the elements. The second contribution of the paper is a proof of concept on

synthetic 1D and 2D problems, demonstrating the efficiency of the proposed

approach.

Incentivising Exploration and Recommendations for Contextual Bandits with Payments

Priyank Agrawal , Theja Tulabandhula

Comments: 11 pages, 4 figures

Subjects : Machine Learning (cs.LG) ; Information Retrieval (cs.IR); Machine Learning (stat.ML)

We propose a contextual bandit based model to capture the learning and social

welfare goals of a web platform in the presence of myopic users. By using

payments to incentivize these agents to explore different

items/recommendations, we show how the platform can learn the inherent

attributes of items and achieve a sublinear regret while maximizing cumulative

social welfare. We also calculate theoretical bounds on the cumulative costs of

incentivization to the platform. Unlike previous works in this domain, we

consider contexts to be completely adversarial, and the behavior of the

adversary is unknown to the platform. Our approach can improve various

engagement metrics of users on e-commerce stores, recommendation engines and

matching platforms.

Convergence Time Optimization for Federated Learning over Wireless Networks

Mingzhe Chen , H. Vincent Poor , Walid Saad , Shuguang Cui Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI); Machine Learning (stat.ML)

In this paper, the convergence time of federated learning (FL), when deployed

over a realistic wireless network, is studied. In particular, a wireless

network is considered in which wireless users transmit their local FL models

(trained using their locally collected data) to a base station (BS). The BS,

acting as a central controller, generates a global FL model using the received

local FL models and broadcasts it back to all users. Due to the limited number

of resource blocks (RBs) in a wireless network, only a subset of users can be

selected to transmit their local FL model parameters to the BS at each learning

step. Moreover, since each user has unique training data samples, the BS

prefers to include all local user FL models to generate a converged global FL

model. Hence, the FL performance and convergence time will be significantly

affected by the user selection scheme. Therefore, it is necessary to design an

appropriate user selection scheme that enables users of higher importance to be

selected more frequently. This joint learning, wireless resource allocation,

and user selection problem is formulated as an optimization problem whose goal

is to minimize the FL convergence time while optimizing the FL performance. To

solve this problem, a probabilistic user selection scheme is proposed such that

the BS is connected to the users whose local FL models have significant effects

on its global FL model with high probabilities. Given the user selection

policy, the uplink RB allocation can be determined. To further reduce the FL

convergence time, artificial neural networks (ANNs) are used to estimate the

local FL models of the users that are not allocated any RBs for local FL model

transmission at each given learning step, which enables the BS to enhance its

global FL model and improve the FL convergence speed and performance.

Coarse-Grain Cluster Analysis of Tensors With Application to Climate Biome Identification

Derek DeSantis , Phillip J. Wolfram , Katrina Bennett , Boian Alexandrov Subjects : Machine Learning (cs.LG) ; Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (stat.ML)

A tensor provides a concise way to codify the interdependence of complex

data. Treating a tensor as a d-way array, each entry records the interaction

between the different indices. Clustering provides a way to parse the

complexity of the data into more readily understandable information. Clustering

methods are heavily dependent on the algorithm of choice, as well as the chosen

hyperparameters of the algorithm. However, their sensitivity to data scales is

largely unknown.

In this work, we apply the discrete wavelet transform to analyze the effects

of coarse-graining on clustering tensor data. We are particularly interested in

understanding how scale affects clustering of the Earth’s climate system. The

discrete wavelet transform allows classification of the Earth’s climate across

a multitude of spatial-temporal scales. The discrete wavelet transform is used

to produce an ensemble of classification estimates, as opposed to a single

classification. Using information theory, we discover a sub-collection of the

ensemble that spans the majority of the variance observed, allowing for

efficient consensus clustering techniques that can be used to identify climate

biomes.
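
A minimal sketch of the coarse-graining step, assuming a Haar wavelet and a one-dimensional signal rather than the tensors used in the paper; the function names `haar_approximation` and `coarse_grain` are illustrative and do not come from the authors. Each level keeps only the approximation coefficients, so the signal halves in length at every scale:

```python
import math


def haar_approximation(signal):
    """One level of the Haar DWT, keeping only the approximation coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [
        (signal[i] + signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]


def coarse_grain(signal, levels):
    """Return the signal at the original scale and at `levels` coarser scales."""
    scales = [list(signal)]
    for _ in range(levels):
        scales.append(haar_approximation(scales[-1]))
    return scales


# Illustrative input: four samples, coarse-grained twice.
scales = coarse_grain([1.0, 1.0, 3.0, 3.0], levels=2)
```

Clustering each entry of the returned list separately would give one classification per scale, which is the kind of ensemble the abstract describes.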

Loss-annealed GAIL for sample efficient and stable Imitation Learning

Rohit Jena , Katia Sycara Subjects : Machine Learning (cs.LG) ; Machine Learning (stat.ML)

Imitation learning is the problem of learning a policy from an expert policy

without access to a reward signal. Often, the expert policy is only available

in the form of expert demonstrations. Behavior cloning and GAIL are two

popularly used methods for performing imitation learning in this setting.

Behavior cloning converges in a few training iterations, but doesn’t reach peak

performance and suffers from compounding errors due to its supervised training

framework and iid assumption. GAIL attempts to tackle this problem by

accounting for the temporal dependencies between states while matching

occupancy measures of the expert and the policy. Although GAIL has shown

successes in a number of environments, it takes a lot of environment

interactions. Given their complementary benefits, existing work has proposed or

attempted combinations of the two methods, without much success. We

look at some of the limitations of existing ideas that try to combine BC and

GAIL, and present an algorithm that combines the best of both worlds to enable

faster and stable training while not compromising on performance. Our algorithm

is embarrassingly simple to implement and seamlessly integrates with different

policy gradient algorithms. We demonstrate the effectiveness of the algorithm

both in low dimensional control tasks in a limited data setting, and in high

dimensional grid world environments.

Massif: Interactive Interpretation of Adversarial Attacks on Deep Learning

Nilaksh Das , Haekyu Park , Zijie J. Wang , Fred Hohman , Robert Firstman , Emily Rogers , Duen Horng (Polo) Chau

Comments: 7 pages

Subjects : Machine Learning (cs.LG) ; Cryptography and Security (cs.CR); Machine Learning (stat.ML)

Deep neural networks (DNNs) are increasingly powering high-stakes

applications such as autonomous cars and healthcare; however, DNNs are often

treated as “black boxes” in such applications. Recent research has also

revealed that DNNs are highly vulnerable to adversarial attacks, raising

serious concerns over deploying DNNs in the real world. To overcome these

deficiencies, we are developing Massif, an interactive tool for deciphering

adversarial attacks. Massif identifies and interactively visualizes neurons and

their connections inside a DNN that are strongly activated or suppressed by an

adversarial attack. Massif provides both a high-level, interpretable overview

of the effect of an attack on a DNN, and a low-level, detailed description of

the affected neurons. These tightly coupled views in Massif help people better

understand which input features are most vulnerable or important for correct

predictions.

Improving Label Ranking Ensembles using Boosting Techniques

Lihi Dery , Erez Shmueli Subjects : Machine Learning (cs.LG) ; Machine Learning (stat.ML)

Label ranking is a prediction task which deals with learning a mapping

between an instance and a ranking (i.e., order) of labels from a finite set,

representing their relevance to the instance. Boosting is a well-known and

reliable ensemble technique that was shown to often outperform other learning

algorithms. While boosting algorithms were developed for a multitude of machine

learning tasks, label ranking tasks were overlooked. In this paper, we propose

a boosting algorithm which was specifically designed for label ranking tasks.

Extensive evaluation of the proposed algorithm on 24 semi-synthetic and

real-world label ranking datasets shows that it significantly outperforms

existing state-of-the-art label ranking algorithms.

Automatic phantom test pattern classification through transfer learning with deep neural networks

Rafael B. Fricks , Justin Solomon , Ehsan Samei Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Medical Physics (physics.med-ph)

Imaging phantoms are test patterns used to measure image quality in computed

tomography (CT) systems. A new phantom platform (Mercury Phantom, Gammex)

provides test patterns for estimating the task transfer function (TTF) or noise

power spectrum (NPS) and simulates different patient sizes. Determining which

image slices are suitable for analysis currently requires manual annotation of

these patterns by an expert, as subtle defects may make an image unsuitable for

measurement. We propose a method of automatically classifying these test

patterns in a series of phantom images using deep learning techniques. By

adapting a convolutional neural network based on the VGG19 architecture with

weights trained on ImageNet, we use transfer learning to produce a classifier

for this domain. The classifier is trained and evaluated with over 3,500

phantom images acquired at a university medical center. Input channels for

color images are successfully adapted to convey contextual information for

phantom images. A series of ablation studies are employed to verify design

aspects of the classifier and evaluate its performance under varying training

conditions. Our solution makes extensive use of image augmentation to produce a

classifier that accurately classifies typical phantom images with 98% accuracy,

while maintaining as much as 86% accuracy when the phantom is improperly

imaged.

Discovering Salient Anatomical Landmarks by Predicting Human Gaze

Richard Droste , Pierre Chatelain , Lior Drukker , Harshita Sharma , Aris T. Papageorghiou , J. Alison Noble

Comments: Accepted at IEEE International Symposium on Biomedical Imaging 2020 (ISBI 2020)

Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Anatomical landmarks are a crucial prerequisite for many medical imaging

tasks. Usually, the set of landmarks for a given task is predefined by experts.

The landmark locations for a given image are then annotated manually or via

machine learning methods trained on manual annotations. In this paper, in

contrast, we present a method to automatically discover and localize anatomical

landmarks in medical images. Specifically, we consider landmarks that attract

the visual attention of humans, which we term visually salient landmarks. We

illustrate the method for fetal neurosonographic images. First, full-length

clinical fetal ultrasound scans are recorded with live sonographer

gaze-tracking. Next, a convolutional neural network (CNN) is trained to predict

the gaze point distribution (saliency map) of the sonographers on scan video

frames. The CNN is then used to predict saliency maps of unseen fetal

neurosonographic images, and the landmarks are extracted as the local maxima of

these saliency maps. Finally, the landmarks are matched across images by

clustering the landmark CNN features. We show that the discovered landmarks can

be used within affine image registration, with average landmark alignment

errors between 4.1% and 10.9% of the fetal head long axis length.
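
The landmark extraction step (local maxima of a saliency map) can be sketched as follows. This is an illustrative example only: it assumes the saliency map is a plain 2D grid of floats, uses an 8-pixel neighbourhood, and applies a hypothetical `threshold` parameter; the paper's actual peak detection may differ.

```python
def local_maxima(saliency, threshold=0.0):
    """Return (row, col) positions that are strictly larger than all their
    8-connected neighbours and exceed `threshold`."""
    rows, cols = len(saliency), len(saliency[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = saliency[r][c]
            if value <= threshold:
                continue
            neighbours = [
                saliency[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Illustrative 4x4 saliency map with two clear peaks.
saliency_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.2],
    [0.0, 0.0, 0.2, 0.7],
]
landmarks = local_maxima(saliency_map, threshold=0.5)
```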

A utility-based analysis of equilibria in multi-objective normal form games

Roxana Rădulescu , Patrick Mannion , Yijie Zhang , Diederik M. Roijers , Ann Nowé

Comments: Under review since 16 January 2020

Subjects : Computer Science and Game Theory (cs.GT) ; Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

In multi-objective multi-agent systems (MOMAS), agents explicitly consider

the possible tradeoffs between conflicting objective functions. We argue that

compromises between competing objectives in MOMAS should be analysed on the

basis of the utility that these compromises have for the users of a system,

where an agent’s utility function maps their payoff vectors to scalar utility

values. This utility-based approach naturally leads to two different

optimisation criteria for agents in a MOMAS: expected scalarised returns (ESR)

and scalarised expected returns (SER). In this article, we explore the

differences between these two criteria using the framework of multi-objective

normal form games (MONFGs). We demonstrate that the choice of optimisation

criterion (ESR or SER) can radically alter the set of equilibria in a MONFG

when non-linear utility functions are used.
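
The difference between the two criteria can be seen in a small numerical sketch. The two-objective lottery and the product utility below are assumptions chosen for illustration and are not taken from the article. ESR applies the utility to each outcome and then averages, while SER averages the payoff vectors first and then applies the utility:

```python
def utility(payoff):
    # Hypothetical non-linear utility: product of the two objectives.
    return payoff[0] * payoff[1]


def esr(payoffs, probs):
    """Expected scalarised return: E[u(p)]."""
    return sum(p * utility(v) for v, p in zip(payoffs, probs))


def ser(payoffs, probs):
    """Scalarised expected return: u(E[p])."""
    n_objectives = len(payoffs[0])
    expected = [
        sum(p * v[i] for v, p in zip(payoffs, probs))
        for i in range(n_objectives)
    ]
    return utility(expected)


# A 50/50 lottery between two extreme payoff vectors.
payoffs = [(4.0, 0.0), (0.0, 4.0)]
probs = [0.5, 0.5]
esr_value = esr(payoffs, probs)  # every single outcome has utility 0
ser_value = ser(payoffs, probs)  # the average payoff (2, 2) has utility 4
```

With a linear utility the two values coincide, which is consistent with the abstract's remark that the criteria diverge when non-linear utility functions are used.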

AppStreamer: Reducing Storage Requirements of Mobile Games through Predictive Streaming

Nawanol Theera-Ampornpunt , Shikhar Suryavansh , Sameer Manchanda , Rajesh Panta , Kaustubh Joshi , Mostafa Ammar , Mung Chiang , Saurabh Bagchi

Comments: 12 pages; EWSN 2020

Subjects : Operating Systems (cs.OS) ; Machine Learning (cs.LG); Machine Learning (stat.ML)

Storage has become a constrained resource on smartphones. Gaming is a popular

activity on mobile devices and the explosive growth in the number of games

coupled with their growing size contributes to the storage crunch. Even where

storage is plentiful, it takes a long time to download and install a heavy app

before it can be launched. This paper presents AppStreamer, a novel technique

for reducing the storage requirements or startup delay of mobile games, and

heavy mobile apps in general. AppStreamer is based on the intuition that most

apps do not need the entirety of their files (images, audio and video clips,

etc.) at any one time. AppStreamer can, therefore, keep only a small part of

the files on the device, akin to a “cache”, and download the remainder from a

cloud storage server or a nearby edge server when it predicts that the app will

need them in the near future. AppStreamer continuously predicts file blocks for

the near future as the user uses the app, and fetches them from the storage

server before the user sees a stall due to missing resources. We implement

AppStreamer at the Android file system layer. This ensures that the apps

require no source code or modification, and the approach generalizes across

apps. We evaluate AppStreamer using two popular games: Dead Effect 2, a 3D

first-person shooter, and Fire Emblem Heroes, a 2D turn-based strategy

role-playing game. Through a user study, 75% and 87% of the users respectively

find that AppStreamer provides the same quality of user experience as the

baseline where all files are stored on the device. AppStreamer cuts down the

storage requirement by 87% for Dead Effect 2 and 86% for Fire Emblem Heroes.

Live Anomaly Detection based on Machine Learning Techniques SAD-F: Spark Based Anomaly Detection Framework

Awais Ahmed , Sufian Hameed , Muhammad Rafi , Qublai Khan Ali Mirza Subjects : Cryptography and Security (cs.CR) ; Machine Learning (cs.LG); Machine Learning (stat.ML)

Anomaly detection is a crucial step for preventing malicious activities in

the network and keeping resources available all the time for legitimate users.

It is noticed from various studies that classical anomaly detectors work well

with small and sampled data, but the chances of failures increase with

real-time (non-sampled) traffic data. In this paper, we explore

security analytic techniques for DDoS anomaly detection using different machine

learning techniques. We propose a novel approach which

deals with real traffic as input to the system. Further, we study and compare

the performance factor of our proposed framework on three different testbeds

including normal commodity hardware, low-end system, and high-end system.

Hardware details of testbeds are discussed in the respective section. Further

in this paper, we investigate the performance of the classifiers in (near)

real-time detection of anomalous attacks. This study also focuses on the

feature selection process that is as important for the anomaly detection

process as it is for general modeling problems. Several techniques have been

studied for feature selection and it is observed that proper feature selection

can increase performance in terms of model’s execution time – which totally

depends upon the traffic file or traffic capturing process.

Unsupervised Domain Adaptation for Neural Machine Translation with Iterative Back Translation

Di Jin , Zhijing Jin , Joey Tianyi Zhou , Peter Szolovits

Comments: Submitted to IJCAI 2020

Subjects : Computation and Language (cs.CL) ; Machine Learning (cs.LG)

State-of-the-art neural machine translation (NMT) systems are data-hungry and

perform poorly on domains with little supervised data. As data collection is

expensive and infeasible in many cases, unsupervised domain adaptation methods

are needed. We apply an Iterative Back Translation (IBT) training scheme on

in-domain monolingual data, which repeatedly uses a Transformer-based NMT model

to create in-domain pseudo-parallel sentence pairs in one translation direction

on the fly and then uses them to train the model in the other direction.

Evaluated on three domains of a German-to-English translation task with no

supervised data, this simple technique alone (without any out-of-domain

parallel data) can already surpass all previous domain adaptation methods—up

to +9.48 BLEU over the strongest previous method, and up to +27.77 BLEU over

the unadapted baseline. Moreover, given available supervised out-of-domain data

on German-to-English and Romanian-to-English language pairs, we can further

enhance the performance and obtain up to +19.31 BLEU improvement over the

strongest baseline, and +47.69 BLEU increment against the unadapted model.

Optimizing Generative Adversarial Networks for Image Super Resolution via Latent Space Regularization

Sheng Zhong , Shifu Zhou (Agora.io)

Comments: 11 pages, 5 figures

Subjects : Image and Video Processing (eess.IV) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Natural images can be regarded as residing in a manifold that is embedded in

a higher dimensional Euclidean space. Generative Adversarial Networks (GANs)

try to learn the distribution of the real images in the manifold to generate

samples that look real. But the results of existing methods still exhibit many

unpleasant artifacts and distortions even for the cases where the desired

ground truth target images are available for supervised learning such as in

single image super resolution (SISR). We probe for ways to alleviate these

problems for supervised GANs in this paper. We explicitly apply the Lipschitz

Continuity Condition (LCC) to regularize the GAN. An encoding network that maps

the image space to a new optimal latent space is derived from the LCC, and it

is used to augment the GAN as a coupling component. The LCC is also converted

to new regularization terms in the generator loss function to enforce local

invariance. The GAN is optimized together with the encoding network in an

attempt to make the generator converge to a more ideal and disentangled mapping

that can generate samples more faithful to the target images. When the proposed

models are applied to the single image super resolution problem, the results

outperform the state of the art.

DDKSP: A Data-Driven Stochastic Programming Framework for Car-Sharing Relocation Problem

Xiaoming Li , Chun Wang , Xiao Huang

Comments: arXiv admin note: text overlap with arXiv:1909.09293

Subjects : Optimization and Control (math.OC) ; Machine Learning (cs.LG); Signal Processing (eess.SP); Applications (stat.AP)

Car sharing is a popular research field in the sharing economy. In this

paper, we investigate the car-sharing relocation problem (CSRP) under uncertain

demands. Normally, the real customer demands follow complicated probability

distributions which cannot be described by parametric approaches. In order to

overcome the problem, an innovative framework called Data-Driven Kernel

Stochastic Programming (DDKSP) that integrates a non-parametric approach –

kernel density estimation (KDE) and a two-stage stochastic programming (SP)

model is proposed. Specifically, the probability distributions are derived from

historical data by KDE, which are used as the input uncertain parameters for

SP. Additionally, the CSRP is formulated as a two-stage SP model. Meanwhile, a

Monte Carlo method called sample average approximation (SAA) and Benders

decomposition algorithm are introduced to solve the large-scale optimization

model. Finally, the numerical experimental validations which are based on New

York taxi trip data sets show that the proposed framework outperforms the pure

parametric approaches including Gaussian, Laplace and Poisson distributions

by 3.72%, 4.58% and 11% respectively in terms of overall profits.

PDS: Deduce Elder Privacy from Smart Homes

Ming-Chang Lee , Jia-Chun Lin , Olaf Owe

Comments: 31 pages, 23 figures, and 2 tables, journal paper. arXiv admin note: text overlap with arXiv:1808.07379

Journal-ref: Internet of Things, 7, 1000072, 2019

Subjects : Cryptography and Security (cs.CR) ; Machine Learning (cs.LG)

With the development of IoT technologies in the past few years, a wide range

of smart devices are deployed in a variety of environments aiming to improve

the quality of human life in a cost efficient way. Due to the increasingly

serious aging problem around the world, smart homes for elder healthcare have

become an important IoT-based application, which not only enables elders’

health to be properly monitored and taken care of, but also allows them to live

more comfortably and independently in their houses. However, elders’ privacy

might be disclosed from smart homes due to non-fully protected network

communication. To show that elders’ privacy could be substantially exposed, in

this paper we develop a Privacy Deduction Scheme (PDS for short) by

eavesdropping sensor traffic from a smart home to identify elders’ movement

activities and speculating sensor locations in the smart home based on a series

of deductions from the viewpoint of an attacker. The experimental results based

on sensor datasets from real smart homes demonstrate the effectiveness of PDS

in deducing and disclosing elders’ privacy, which might be maliciously

exploited by attackers to endanger elders and their properties.

Stratified cross-validation for unbiased and privacy-preserving federated learning

R. Bey , R. Goussault , M. Benchoufi , R. Porcher

Comments: 13 pages, 5 figures

Subjects : Machine Learning (stat.ML) ; Machine Learning (cs.LG); Methodology (stat.ME)

Large-scale collections of electronic records constitute both an opportunity

for the development of more accurate prediction models and a threat to

privacy. To limit privacy exposure, new privacy-enhancing techniques are

emerging such as federated learning which enables large-scale data analysis

while avoiding the centralization of records in a unique database that would

represent a critical point of failure. Although promising regarding privacy

protection, federated learning prevents using some data-cleaning algorithms

thus inducing new biases. In this work we focus on the recurrent problem of

duplicated records that, if not handled properly, may cause over-optimistic

estimations of a model’s performances. We introduce and discuss stratified

cross-validation, a validation methodology that leverages stratification

techniques to prevent data leakage in federated learning settings without

relying on demanding deduplication algorithms.
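
One way to picture the idea is sketched below. It is an assumption-laden illustration, not the authors' procedure: it supposes that duplicated records share the value of some stratification variable (here a hypothetical `dob` field) and assigns folds by hashing that value, so records with the same key always land in the same fold without any cross-site record comparison.

```python
import hashlib


def fold_of(key, n_folds):
    """Deterministically map a stratification key to a fold index."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_folds


def stratified_folds(records, key_fn, n_folds):
    """Group records into folds so that equal keys never straddle two folds."""
    folds = [[] for _ in range(n_folds)]
    for record in records:
        folds[fold_of(key_fn(record), n_folds)].append(record)
    return folds


# Hypothetical records held at two sites; the first two are duplicates.
records = [
    {"dob": "1980-01-01", "site": "A"},
    {"dob": "1980-01-01", "site": "B"},
    {"dob": "1975-06-30", "site": "A"},
]
folds = stratified_folds(records, key_fn=lambda r: r["dob"], n_folds=3)
```

Because the fold index depends only on the key, each site can compute it locally, which fits the federated setting described in the abstract.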

Training Neural Network Controllers Using Control Barrier Functions in the Presence of Disturbances

Shakiba Yaghoubi , Georgios Fainekos , Sriram Sankaranarayanan Subjects : Optimization and Control (math.OC) ; Machine Learning (cs.LG); Systems and Control (eess.SY); Machine Learning (stat.ML)

Control Barrier Functions (CBF) have been recently utilized in the design of

provably safe feedback control laws for nonlinear systems. These feedback

control methods typically compute the next control input by solving an online

Quadratic Program (QP). Solving a QP in real time can be a computationally

expensive process for resource-constrained systems. In this work, we propose to

use imitation learning to learn Neural Network-based feedback controllers which

will satisfy the CBF constraints. In the process, we also develop a new class

of High Order CBF for systems under external disturbances. We demonstrate the

framework on a unicycle model subject to external disturbances, e.g., wind or

currents.
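
For intuition about the online QP that such controllers solve, the sketch below handles the simplest case: a single affine safety constraint a·u + b >= 0, for which the minimum-norm correction of a nominal input has a closed form (a projection onto a half-space). The toy system, the barrier function and all names are assumptions for illustration; the paper's high-order CBFs under disturbances are not modelled here.

```python
def cbf_filter(u_nom, a, b):
    """Solve min ||u - u_nom||^2 subject to a.u + b >= 0 in closed form."""
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if slack >= 0:
        return list(u_nom)  # nominal input already satisfies the constraint
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: a is zero and b < 0")
    # Move u_nom the shortest distance onto the constraint boundary.
    return [ui - slack / norm_sq * ai for ai, ui in zip(a, u_nom)]


# Toy example: integrator x' = u with barrier h(x) = x_max - x.
# The condition h' + alpha * h >= 0 becomes -u + alpha * (x_max - x) >= 0.
x, x_max, alpha = 0.8, 1.0, 2.0
a = [-1.0]
b = alpha * (x_max - x)
safe_u = cbf_filter([1.0], a, b)  # nominal input 1.0 is reduced to about 0.4
```

A learned controller of the kind the abstract proposes would aim to reproduce the output of such a filter without solving the optimisation at run time.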

Active Learning over DNN: Automated Engineering Design Optimization for Fluid Dynamics Based on Self-Simulated Dataset

Yang Chen Subjects : Computational Engineering, Finance, and Science (cs.CE) ; Machine Learning (cs.LG); Machine Learning (stat.ML)

Optimizing fluid-dynamic performance is an important engineering task.

Traditionally, experts design shapes based on empirical estimations and verify

them through expensive experiments. This costly process, both in terms of time

and space, may only explore a limited number of shapes and lead to sub-optimal

designs. In this research, a test-proven deep learning architecture is applied

to predict the performance under various restrictions and search for better

shapes by optimizing the learned prediction function. The major challenge is

the vast number of data points a Deep Neural Network (DNN) demands, which is

costly to simulate. To remedy this drawback, a frequentist active learning scheme

is used to explore regions of the output space that the DNN predicts to be promising.

This operation reduces the number of data samples demanded from ~8000 to 625.

The final stage, a user interface, made the model capable of optimizing with

given user input of minimum area and viscosity. Flood fill is used to define a

boundary area function so that the optimal shape does not bypass the minimum

area. Stochastic Gradient Langevin Dynamics (SGLD) is employed to make sure the

ultimate shape is optimized while circumventing the required area. Jointly,

shapes with extremely low drag are found and explored through a practical user

interface with no human domain knowledge and modest computation overhead.

ESRGAN+ : Further Improving Enhanced Super-Resolution Generative Adversarial Network

Nathanaël Carraz Rakotonirina , Andry Rasoanaivo Subjects : Image and Video Processing (eess.IV) ; Machine Learning (cs.LG)

Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) is a

perceptual-driven approach for single image super resolution that is able to

produce photorealistic images. Despite the visual quality of these generated

images, there is still room for improvement. In this fashion, the model is

extended to further improve the perceptual quality of the images. We have

designed a novel block to replace the one used by the original ESRGAN.

Moreover, we introduce noise inputs to the generator network in order to

exploit stochastic variation. The resulting images present more realistic

textures.

Up to two billion times acceleration of scientific simulations with deep neural architecture search

M. F. Kasim , D. Watson-Parris , L. Deaconu , S. Oliver , P. Hatfield , D. H. Froula , G. Gregori , M. Jarvis , S. Khatiwala , J. Korenaga , J. Topp-Mugglestone , E. Viezzer , S. M. Vinko Subjects : Machine Learning (stat.ML) ; Machine Learning (cs.LG); Atmospheric and Oceanic Physics (physics.ao-ph); Computational Physics (physics.comp-ph); Plasma Physics (physics.plasm-ph)

Computer simulations are invaluable tools for scientific discovery. However,

accurate simulations are often slow to execute, which limits their

applicability to extensive parameter exploration, large-scale data analysis,

and uncertainty quantification. A promising route to accelerate simulations by

building fast emulators with machine learning requires large training datasets,

which can be prohibitively expensive to obtain with slow simulations. Here we

present a method based on neural architecture search to build accurate

emulators even with a limited amount of training data. The method successfully

accelerates simulations by up to 2 billion times in 10 scientific cases

including astrophysics, climate science, biogeochemistry, high energy density

physics, fusion energy, and seismology, using the same super-architecture,

algorithm, and hyperparameters. Our approach also inherently provides emulator

uncertainty estimation, adding further confidence in their use. We anticipate

this work will accelerate research involving expensive simulations, allow more

extensive parameter exploration, and enable new, previously unfeasible

computational discovery.

Contextualized Embeddings in Named-Entity Recognition: An Empirical Study on Generalization

Bruno Taillé , Vincent Guigue , Patrick Gallinari

Journal-ref: ECIR 2020

Subjects

:

Computation and Language (cs.CL)

; Machine Learning (cs.LG)

Contextualized embeddings use unsupervised language model pretraining to

compute word representations depending on their context. This is intuitively

useful for generalization, especially in Named-Entity Recognition where it is

crucial to detect mentions never seen during training. However, standard

English benchmarks overestimate the importance of lexical over contextual

features because of an unrealistic lexical overlap between train and test

mentions. In this paper, we perform an empirical analysis of the generalization

capabilities of state-of-the-art contextualized embeddings by separating

mentions by novelty and with out-of-domain evaluation. We show that they are

particularly beneficial for unseen mentions detection, especially

out-of-domain. For models trained on CoNLL03, language model contextualization

leads to a +1.2% maximal relative micro-F1 score increase in-domain against

+13% out-of-domain on the WNUT dataset.

On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation

Nicolas Brosse , Carlos Riquelme , Alice Martin , Sylvain Gelly , Éric Moulines Subjects : Machine Learning (stat.ML) ; Machine Learning (cs.LG)

Uncertainty quantification for deep learning is a challenging open problem.

Bayesian statistics offer a mathematically grounded framework to reason about

uncertainties; however, approximate posteriors for modern neural networks still

require prohibitive computational costs. We propose a family of algorithms

which split the classification task into two stages: representation learning

and uncertainty estimation. We compare four specific instances, where

uncertainty estimation is performed via either an ensemble of Stochastic

Gradient Descent or Stochastic Gradient Langevin Dynamics snapshots, an

ensemble of bootstrapped logistic regressions, or via a number of Monte Carlo

Dropout passes. We evaluate their performance in terms of selective

classification (risk-coverage), and their ability to detect out-of-distribution

samples. Our experiments suggest there is limited value in adding multiple

uncertainty layers to deep classifiers, and we observe that these simple

methods strongly outperform a vanilla point-estimate SGD in some complex

benchmarks like ImageNet.
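
As a rough illustration of the risk-coverage evaluation mentioned above, the following minimal sketch computes a risk-coverage curve from per-sample confidence scores. The function name `risk_coverage_curve` and the toy inputs are assumptions made for illustration; they are not taken from the paper.

```python
def risk_coverage_curve(confidences, correct):
    """Return (coverage, risk) pairs, accepting predictions from most to least confident.

    coverage = fraction of samples the classifier chooses to answer,
    risk     = error rate among the answered samples.
    """
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    curve = []
    errors = 0
    for accepted, i in enumerate(order, start=1):
        if not correct[i]:
            errors += 1
        curve.append((accepted / len(order), errors / accepted))
    return curve


# Hypothetical toy data: four predictions with their confidences and correctness.
confidences = [0.9, 0.8, 0.6, 0.4]
correct = [True, True, False, True]
curve = risk_coverage_curve(confidences, correct)
```

A better uncertainty estimate pushes errors toward the low-confidence end, so risk stays low until coverage is close to 1.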

Attention! A Lightweight 2D Hand Pose Estimation Approach

Nicholas Santavas , Ioannis Kansizoglou , Loukas Bampis , Evangelos Karakasis , Antonios Gasteratos

Comments: submitted to IEEE Signal Processing Letters

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

Vision-based human pose estimation is a non-invasive technology for

Human-Computer Interaction (HCI). Direct use of the hand as an input device

provides an attractive interaction method, with no need for specialized sensing

equipment, such as exoskeletons or gloves, other than a camera. Traditionally, HCI

is employed in various applications spreading in areas including manufacturing,

surgery, entertainment industry and architecture, to mention a few. Deployment

of vision based human pose estimation algorithms can give a breath of

innovation to these applications. In this letter, we present a novel

Convolutional Neural Network architecture, reinforced with a Self-Attention

module, which can be deployed on an embedded system due to its lightweight

nature, with just 1.9 Million parameters. The source code and qualitative

results are publicly available.

Machine Learning for Network Slicing Resource Management: A Comprehensive Survey

Bin Han , Hans D. Schotten

Comments: To appear in ZTE Communications, 2020

Subjects

:

Networking and Internet Architecture (cs.NI)

; Machine Learning (cs.LG)

The emerging technology of multi-tenancy network slicing is considered as an

essential feature of 5G cellular networks. It provides network slices as a new

type of public cloud services, and therewith increases the service flexibility

and enhances the network resource efficiency. Meanwhile, it raises new

challenges of network resource management. Various methods have

been proposed in recent years, in which machine learning and

artificial intelligence techniques are widely deployed. In this article, we

provide a survey of existing approaches to network slicing resource management,

with a highlight on the roles played by machine learning in them.

On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning

Ameya Pore , Gerardo Aragon-Camarasa

Comments: 6 pages, 5 figures

Subjects

:

Robotics (cs.RO)

; Machine Learning (cs.LG)

We present a behaviour-based reinforcement learning approach, inspired by

Brooks' subsumption architecture, in which simple fully connected networks are

trained as reactive behaviours. Our working assumption is that a pick and place

robotic task can be simplified by leveraging domain knowledge of a robotics

developer to decompose and train such reactive behaviours; namely, approach,

grasp, and retract. Then the robot autonomously learns how to combine them via

an Actor-Critic architecture. The Actor-Critic policy is to determine the

activation and inhibition mechanisms of the reactive behaviours in a particular

temporal sequence. We validate our approach in a simulated robot environment

where the task is picking a block and taking it to a target position while

orienting the gripper from a top grasp. The latter represents an extra

degree-of-freedom to which current end-to-end reinforcement learning approaches fail to

generalise. Our findings suggest that robotic learning can be more effective if

each behaviour is learnt in isolation and then combined to accomplish the

task. That is, our approach learns the pick and place task in 8,000 episodes,

which represents a drastic reduction in the number of training episodes

required by an end-to-end approach and the existing state-of-the-art

algorithms.

Machine Learning assisted Handover and Resource Management for Cellular Connected Drones

Amin Azari , Fayezeh Ghavimi , Mustafa Ozger , Riku Jantti , Cicek Cavdar Subjects : Signal Processing (eess.SP) ; Machine Learning (cs.LG); Machine Learning (stat.ML)

Enabling cellular connectivity for drones introduces a wide set of challenges

and opportunities. Communication of cellular-connected drones is influenced by

3-dimensional mobility and line-of-sight channel characteristics, which result

in a higher number of handovers with increasing altitude. Our cell planning

simulations in coexistence of aerial and terrestrial users indicate that the

severe interference from drones to base stations is a major challenge for

uplink communications of terrestrial users. Here, we first present the major

challenges in co-existence of terrestrial and drone communications by

considering real geographical network data for Stockholm. Then, we derive

analytical models for the key performance indicators (KPIs), including

communications delay and interference over cellular networks, and formulate the

handover and radio resource management (H-RRM) optimization problem.

Afterwards, we transform this problem into a machine learning problem, and

propose a deep reinforcement learning solution to solve the H-RRM problem. Finally,

using simulation results, we present how the speed and altitude of drones, and

the tolerable level of interference, shape the optimal H-RRM policy in the

network. In particular, heat maps of handover decisions at different drone

altitudes/speeds are presented, which promote a revision of the legacy

handover schemes and redefining the boundaries of cells in the sky.

Adversarial Attack on Community Detection by Hiding Individuals

Jia Li , Honglei Zhang , Zhichao Han , Yu Rong , Hong Cheng , Junzhou Huang

Comments: In Proceedings of The Web Conference 2020, April 20-24, 2020, Taipei, Taiwan. 11 pages

Subjects

:

Social and Information Networks (cs.SI)

; Cryptography and Security (cs.CR); Machine Learning (cs.LG); Machine Learning (stat.ML)

It has been demonstrated that adversarial graphs, i.e., graphs with

imperceptible perturbations added, can cause deep graph models to fail on

node/graph classification tasks. In this paper, we extend adversarial graphs to

the problem of community detection, which is much more difficult. We focus on

black-box attacks and aim to hide targeted individuals from the detection of

deep graph community detection models, which has many applications in

real-world scenarios, for example, protecting personal privacy in social

networks and understanding camouflage patterns in transaction networks. We

propose an iterative learning framework that takes turns to update two modules:

one working as the constrained graph generator and the other as the surrogate

community detection model. We also find that the adversarial graphs generated

by our method can be transferred to other learning based community detection

models.

Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave Communication

Hamza Khan , Anis Elgabli , Sumudu Samarakoon , Mehdi Bennis , Choong Seon Hong

Comments: 13 pages, 14 figures

Journal-ref: IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, VOL. 5, NO. 4, DECEMBER 2019

Subjects

:

Networking and Internet Architecture (cs.NI)

; Machine Learning (cs.LG)

Vehicle-to-everything (V2X) communication is a growing area of communication

with a variety of use cases. This paper investigates the problem of

vehicle-cell association in millimeter wave (mmWave) communication networks.

The aim is to maximize the time average rate per vehicular user (VUE) while

ensuring a target minimum rate for all VUEs with low signaling overhead. We

first formulate the user (vehicle) association problem as a discrete non-convex

optimization problem. Then, by leveraging tools from machine learning,

specifically distributed deep reinforcement learning (DDRL) and the

asynchronous actor critic algorithm (A3C), we propose a low complexity

algorithm that approximates the solution of the proposed optimization problem.

The proposed DDRL-based algorithm endows every road side unit (RSU) with a

local RL agent that selects a local action based on the observed input state.

Actions of different RSUs are forwarded to a central entity that computes a

global reward which is then fed back to RSUs. It is shown that each

independently trained RL agent performs the vehicle-RSU association action with low

control overhead and less computational complexity compared to running an

online complex algorithm to solve the non-convex optimization problem. Finally,

simulation results show that the proposed solution achieves up to 15% gains in

terms of sum rate and 20% reduction in VUE outages compared to several

baseline designs.

Normalization of Input-output Shared Embeddings in Text Generation Models

Jinyang Liu , Yujia Zhai , Zizhong Chen Subjects : Computation and Language (cs.CL) ; Machine Learning (cs.LG)

Neural Network based models have been state-of-the-art models for various

Natural Language Processing tasks; however, the input and output dimension

problem in the networks has still not been fully resolved, especially in text

generation tasks (e.g. Machine Translation, Text Summarization), in which input

and output both have huge sizes of vocabularies. Therefore, input-output

embedding weight sharing has been introduced and adopted widely, which remains

to be improved. Based on linear algebra and statistical theories, this paper

locates the shortcoming of the existing input-output embedding weight sharing

method, then proposes methods for improving input-output weight shared embedding,

among which methods of normalization of embedding weight matrices show best

performance. These methods add almost no computational cost, can be combined

with other embedding techniques, and show good effectiveness when applied on

state-of-the-art Neural Network models. For Transformer-big models, the

normalization techniques achieve up to 0.6 BLEU improvement compared to the

original version of the model on the WMT’16 En-De dataset, and similar BLEU

improvements on the IWSLT’14 datasets. For DynamicConv models, a 0.5 BLEU

improvement can be attained on the WMT’16 En-De dataset, and a 0.41 BLEU improvement

on the IWSLT’14 De-En translation task is achieved.

Keyword-based Topic Modeling and Keyword Selection

Xingyu Wang , Lida Zhang , Diego Klabjan Subjects : Machine Learning (stat.ML) ; Information Retrieval (cs.IR); Machine Learning (cs.LG)

Certain types of documents, such as tweets, are collected by specifying a set of

keywords. As topics of interest change with time it is beneficial to adjust

keywords dynamically. The challenge is that these need to be specified ahead of

knowing the forthcoming documents and the underlying topics. The future topics

should mimic past topics of interest yet there should be some novelty in them.

We develop a keyword-based topic model that dynamically selects a subset of

keywords to be used to collect future documents. The generative process first

selects keywords and then the underlying documents based on the specified

keywords. The model is trained by using a variational lower bound and

stochastic gradient optimization. The inference consists of finding a subset of

keywords where given a subset the model predicts the underlying topic-word

matrix for the unknown forthcoming documents. We compare the keyword topic

model against a benchmark model using viral predictions of tweets combined with

a topic model. The keyword-based topic model outperforms this sophisticated

baseline model by 67%.

Optimal estimation of sparse topic models

Xin Bing , Florentina Bunea , Marten Wegkamp Subjects : Machine Learning (stat.ML) ; Information Retrieval (cs.IR); Machine Learning (cs.LG)

Topic models have become popular tools for dimension reduction and

exploratory analysis of text data which consists in observed frequencies of a

vocabulary of \(p\) words in \(n\) documents, stored in a \(p \times n\) matrix. The

main premise is that the mean of this data matrix can be factorized into a

product of two non-negative matrices: a \(p \times K\) word-topic matrix \(A\) and a

\(K \times n\) topic-document matrix \(W\). This paper studies the estimation of \(A\)

that is possibly element-wise sparse, and the number of topics \(K\) is unknown.

In this under-explored context, we derive a new minimax lower bound for the

estimation of such \(A\) and propose a new computationally efficient algorithm

for its recovery. We derive a finite sample upper bound for our estimator, and

show that it matches the minimax lower bound in many scenarios. Our estimate

adapts to the unknown sparsity of \(A\) and our analysis is valid for any finite

\(n\), \(p\), \(K\) and document lengths. Empirical results on both synthetic data

and semi-synthetic data show that our proposed estimator is a strong competitor

of the existing state-of-the-art algorithms for both non-sparse \(A\) and sparse

\(A\), and has superior performance in many scenarios of interest.

A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis

Christopher J. Urban , Daniel J. Bauer

Comments: 31 pages, 9 figures

Subjects

:

Methodology (stat.ME)

; Machine Learning (cs.LG); Machine Learning (stat.ML)

Deep learning methods are the gold standard for non-linear statistical

modeling in computer vision and in natural language processing but are rarely

used in psychometrics. To bridge this gap, we present a novel deep learning

algorithm for exploratory item factor analysis (IFA). Our approach combines a

deep artificial neural network (ANN) model called a variational autoencoder

(VAE) with recent work that uses regularization for exploratory factor

analysis. We first provide overviews of ANNs and VAEs. We then describe how to

conduct exploratory IFA with a VAE and demonstrate our approach in two

empirical examples and in two simulated examples. Our empirical results were

consistent with existing psychological theory across random starting values.

Our simulations suggest that the VAE consistently recovers the data generating

factor pattern with moderate-sized samples. Secondary loadings were

underestimated with a complex factor structure and intercept parameter

estimates were moderately biased with both simple and complex factor

structures. All models converged in minutes, even with hundreds of thousands of

observations, hundreds of items, and tens of factors. We conclude that the VAE

offers a powerful new approach to fitting complex statistical models in

psychological and educational measurement.

Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion

Wen-Chin Huang , Hao Luo , Hsin-Te Hwang , Chen-Chou Lo , Yu-Huai Peng , Yu Tsao , Hsin-Min Wang

Comments: Accepted to IEEE Transactions on Emerging Topics in Computational Intelligence

Subjects

:

Audio and Speech Processing (eess.AS)

; Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)

An effective approach for voice conversion (VC) is to disentangle linguistic

content from other components in the speech signal. The effectiveness of

variational autoencoder (VAE) based VC (VAE-VC), for instance, strongly relies

on this principle. In our prior work, we proposed a cross-domain VAE-VC

(CDVAE-VC) framework, which utilized acoustic features of different properties,

to improve the performance of VAE-VC. We believed that the success came from

more disentangled latent representations. In this paper, we extend the CDVAE-VC

framework by incorporating the concept of adversarial learning, in order to

further increase the degree of disentanglement, thereby improving the quality

and similarity of converted speech. More specifically, we first investigate the

effectiveness of incorporating the generative adversarial networks (GANs) with

CDVAE-VC. Then, we consider the concept of domain adversarial training and add

an explicit constraint to the latent representation, realized by a speaker

classifier, to explicitly eliminate the speaker information that resides in the

latent code. Experimental results confirm that the degree of disentanglement of

the learned latent representation can be enhanced by both GANs and the speaker

classifier. Meanwhile, subjective evaluation results in terms of quality and

similarity scores demonstrate the effectiveness of our proposed methods.

Anomaly detection in chest radiographs with a weakly supervised flow-based deep learning method

H. Shibata (1), S. Hanaoka (2), Y. Nomura (1), T. Nakao (3), I. Sato (2 and 4 and 5), N. Hayashi (1), O. Abe (2 and 3) ((1) Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, (2) Department of Radiology, The University of Tokyo Hospital, (3) Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, (4) Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo, (5) Center for Advanced Intelligence Project, RIKEN) Subjects : Image and Video Processing (eess.IV) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Preventing the oversight of anomalies in chest X-ray radiographs (CXRs)

during diagnosis is a crucial issue. Deep learning (DL)-based anomaly detection

methods are rapidly growing in popularity, and provide effective solutions to

the problem, but the workload in labeling CXRs during the training procedure

remains heavy. To reduce the workload, a novel anomaly detection method for

CXRs based on weakly supervised DL is presented in this study. The DL is based

on a flow-based deep neural network (DNN) framework with which two normality

metrics (logarithm likelihood and logarithm likelihood ratio) can be

calculated. With this method, only one set of normal CXRs requires labeling to

train the DNN, then the normality of any unknown CXR can be evaluated. The area

under the receiver operating characteristic curve acquired with the logarithm

likelihood ratio metric (\(\approx 0.783\)) was greater than that obtained with

the logarithm likelihood metric, and was a value comparable to those in

previous studies where other weakly supervised DNNs were implemented.

LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching

Angfan Zhu , Jiaqi Yang , Chen Zhao , Ke Xian , Zhiguo Cao , Xin Li

Comments: 7 pages, 9 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Machine Learning (cs.LG)

The local reference frame (LRF) plays a critical role in 3D local shape

description and matching. However, most existing LRFs are hand-crafted and

suffer from limited repeatability and robustness. This paper presents the first

attempt to learn an LRF via a Siamese network that needs weak supervision only.

In particular, we argue that each neighboring point in the local surface gives

a unique contribution to LRF construction and measure such contributions via

learned weights. Extensive analysis and comparative experiments on three public

datasets addressing different application scenarios have demonstrated that

LRF-Net is more repeatable and robust than several state-of-the-art LRF methods

(LRF-Net is only trained on one dataset). In addition, LRF-Net can

significantly boost the local shape description and 6-DoF pose estimation

performance when matching 3D point clouds.

NeurOpt: Neural network based optimization for building energy management and climate control

Achin Jain , Francesco Smarra , Enrico Reticcioli , Alessandro D'Innocenzo , Manfred Morari Subjects : Systems and Control (eess.SY) ; Machine Learning (cs.LG)

Model predictive control (MPC) can provide significant energy cost savings in

building operations in the form of energy-efficient control with better

occupant comfort, lower peak demand charges, and risk-free participation in

demand response. However, the engineering effort required to obtain

physics-based models of buildings for MPC is considered to be the biggest

bottleneck in making MPC scalable to real buildings. In this paper, we propose

a data-driven control algorithm based on neural networks to reduce this cost of

model identification. Our approach does not require building domain expertise

or retrofitting of the existing heating and cooling systems. We validate our

learning and control algorithms on a two-story building with 10 independently

controlled zones, located in Italy. We learn dynamical models of energy

consumption and zone temperatures with high accuracy and demonstrate energy

savings and better occupant comfort compared to the default system controller.

Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved Complexities

Zhongruo Wang , Krishnakumar Balasubramanian , Shiqian Ma , Meisam Razaviyayn Subjects : Machine Learning (stat.ML) ; Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Optimization and Control (math.OC)

In this paper, we study zeroth-order algorithms for minimax optimization

problems that are nonconvex in one variable and strongly-concave in the other

variable. Such minimax optimization problems have attracted significant

attention lately due to their applications in modern machine learning tasks. We

first design and analyze the Zeroth-Order Gradient Descent Ascent

(ZO-GDA) algorithm, and provide improved results compared to existing

works, in terms of oracle complexity. Next, we propose the Zeroth-Order

Gradient Descent Multi-Step Ascent (ZO-GDMSA) algorithm that

significantly improves the oracle complexity of ZO-GDA. We also

provide stochastic versions of ZO-GDA and ZO-GDMSA to handle

stochastic nonconvex minimax problems, and provide oracle complexity results.
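
To make the idea of zeroth-order gradient descent ascent concrete, here is a minimal sketch that uses only function evaluations. It assumes a deterministic coordinate-wise two-point finite-difference estimator rather than whatever estimator the paper analyzes, and the objective `toy`, the step size, and the iteration count are hypothetical choices. The toy objective is convex in x for simplicity, not nonconvex as in the paper's setting.

```python
def zo_partial(f, point, index, h=1e-4):
    """Two-point finite-difference estimate of the partial derivative of f
    with respect to point[index], using function values only."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(up) - f(down)) / (2 * h)


def zo_gda(f, x, y, step=0.1, iters=200):
    """Simultaneous descent on x and ascent on y with estimated gradients."""
    for _ in range(iters):
        gx = zo_partial(f, [x, y], 0)
        gy = zo_partial(f, [x, y], 1)
        x, y = x - step * gx, y + step * gy
    return x, y


def toy(p):
    """Hypothetical test objective: strongly concave in y, saddle point at (0, 0)."""
    x, y = p
    return 0.5 * x * x + x * y - 0.5 * y * y


x_final, y_final = zo_gda(toy, 1.0, -1.0)
```

Each iteration here costs four function evaluations, which is the kind of quantity an oracle-complexity bound counts.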

Depth-Based Selective Blurring in Stereo Images Using Accelerated Framework

Subhayan Mukherjee , Ram Mohana Reddy Guddeti

Comments: arXiv admin note: text overlap with arXiv:2001.06967

Journal-ref: 3D Research (Springer) 5, Article number: 14 (2014)

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Machine Learning (cs.LG); Image and Video Processing (eess.IV)

We propose a hybrid method for stereo disparity estimation by combining block

and region-based stereo matching approaches. It generates dense depth maps from

disparity measurements of only 18% of image pixels (left or right). The

methodology involves segmenting pixel lightness values using a fast K-Means

implementation, refining segment boundaries using morphological filtering and

connected components analysis; then determining boundaries’ disparities using

the sum of absolute differences (SAD) cost function. Complete disparity maps are

reconstructed from boundaries’ disparities. We consider an application of our

method for depth-based selective blurring of non-interest regions of stereo

images, using Gaussian blur to de-focus users’ non-interest regions.

Experiments on the Middlebury dataset demonstrate that our method outperforms

traditional disparity estimation approaches using SAD and normalized cross

correlation by up to 33.6% and some recent methods by up to 6.1%. Further,

our method is highly parallelizable using a CPU and GPU framework based on Java

Thread Pool and APARAPI, with a speed-up of 5.8 for 250 stereo video frames (4,096

× 2,304).
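
The SAD cost function mentioned above can be illustrated with a small block-matching sketch on a single scanline. This is a generic textbook form, not the authors' hybrid method; the function names, block size, search range, and pixel values are assumptions chosen for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long pixel blocks."""
    return sum(abs(p - q) for p, q in zip(a, b))


def best_disparity(left_row, right_row, x, block=3, max_disp=4):
    """Disparity d that minimizes SAD between the left block starting at x
    and the right block starting at x - d."""
    ref = left_row[x:x + block]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        start = x - d
        if start < 0:
            break  # candidate block would fall outside the right image
        cost = sad(ref, right_row[start:start + block])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Hypothetical scanlines: the left view is the right view shifted by 2 pixels.
right_row = [10, 10, 50, 80, 20, 10, 10, 10, 10, 10]
left_row = [10, 10] + right_row[:-2]
disparity = best_disparity(left_row, right_row, x=4)
```

Running this search only at segment boundaries, rather than at every pixel, is in the spirit of the sparse measurement strategy described in the abstract.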

When does the Tukey median work?

Banghua Zhu , Jiantao Jiao , Jacob Steinhardt Subjects : Statistics Theory (math.ST) ; Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)

We analyze the performance of the Tukey median estimator under total

variation (TV) distance corruptions. Previous results show that under Huber’s

additive corruption model, the breakdown point is 1/3 for high-dimensional

halfspace-symmetric distributions. We show that under TV corruptions, the

breakdown point reduces to 1/4 for the same set of distributions. We also show

that a certain projection algorithm can attain the optimal breakdown point of

1/2. Both the Tukey median estimator and the projection algorithm achieve

sample complexity linear in dimension.

Weakly Supervised Temporal Action Localization Using Deep Metric Learning

Ashraful Islam , Richard J. Radke

Comments: accepted to WACV 2020

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Machine Learning (cs.LG)

Temporal action localization is an important step towards video

understanding. Most current action localization methods depend on untrimmed

videos with full temporal annotations of action instances. However, it is

expensive and time-consuming to annotate both action labels and temporal

boundaries of videos. To this end, we propose a weakly supervised temporal

action localization method that only requires video-level action instances as

supervision during training. We propose a classification module to generate

action labels for each segment in the video, and a deep metric learning module

to learn the similarity between different action instances. We jointly optimize

a balanced binary cross-entropy loss and a metric loss using a standard

backpropagation algorithm. Extensive experiments demonstrate the effectiveness

of both of these components in temporal localization. We evaluate our algorithm

on two challenging untrimmed video datasets: THUMOS14 and ActivityNet1.2. Our

approach improves the current state-of-the-art result for THUMOS14 by 6.5% mAP

at IoU threshold 0.5, and achieves competitive performance for ActivityNet1.2.
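
For readers unfamiliar with the IoU threshold used in the mAP figure above, the following sketch computes the temporal intersection-over-union of two segments. The function name and the example segments are hypothetical; a predicted segment typically counts as correct when its IoU with a ground-truth instance of the same class reaches the threshold.

```python
def temporal_iou(pred, truth):
    """Intersection-over-union of two (start, end) time segments."""
    (ps, pe), (ts, te) = pred, truth
    inter = max(0.0, min(pe, te) - max(ps, ts))
    union = (pe - ps) + (te - ts) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical segments in seconds: prediction 2-8 s, ground truth 4-10 s.
iou = temporal_iou((2.0, 8.0), (4.0, 10.0))
is_hit = iou >= 0.5
```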

GhostImage: Perception Domain Attacks against Vision-based Object Classification Systems

Yanmao Man , Ming Li , Ryan Gerdes Subjects : Cryptography and Security (cs.CR) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

In vision-based object classification systems, imaging sensors perceive the

environment and then objects are detected and classified for decision-making

purposes. Vulnerabilities in the perception domain enable an attacker to inject

false data into the sensor which could lead to unsafe consequences. In this

work, we focus on camera-based systems and propose GhostImage attacks, with the

goal of either creating a fake perceived object or obfuscating the object’s

image that leads to wrong classification results. This is achieved by remotely

projecting adversarial patterns into camera-perceived images, exploiting two

common effects in optical imaging systems, namely lens flare/ghost effects, and

auto-exposure control. To improve the robustness of the attack to channel

perturbations, we generate optimal input patterns by integrating adversarial

machine learning techniques with a trained end-to-end channel model. We realize

GhostImage attacks with a projector, and conduct comprehensive experiments,

using three different image datasets, in indoor and outdoor environments, and

three different cameras. We demonstrate that GhostImage attacks are applicable

to both autonomous driving and security surveillance scenarios. Experimental

results show that, depending on the projector-camera distance, attack success

rates can reach as high as 100%.

Machine Learning for Performance-Aware Virtual Network Function Placement

Dimitrios Michael Manias , Manar Jammal , Hassan Hawilo , Abdallah Shami , Parisa Heidari , Adel Larabi , Richard Brunner

Comments: 6 pages, 6 figures, 1 table, 9 equations, 18 references, Conference

Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI); Machine Learning (stat.ML)

With the growing demand for data connectivity, network service providers are

faced with the task of reducing their capital and operational expenses while

simultaneously improving network performance and addressing the increased

connectivity demand. Although Network Function Virtualization (NFV) has been

identified as a solution, several challenges must be addressed to ensure its

feasibility. In this paper, we address the Virtual Network Function (VNF)

placement problem by developing a machine learning decision tree model that

learns from the effective placement of the various VNF instances forming a

Service Function Chain (SFC). The model takes several performance-related

features from the network as an input and selects the placement of the various

VNF instances on network servers with the objective of minimizing the delay

between dependent VNF instances. The benefits of using machine learning are

realized by moving away from complex mathematical modelling of the system and

towards a data-based understanding of the system. Using the Evolved Packet Core

(EPC) as a use case, we evaluate our model on different data center networks

and compare it to the BACON algorithm in terms of the delay between

interconnected components and the total delay across the SFC. Furthermore, a

time complexity analysis is performed to show the effectiveness of the model in

NFV applications.

Emergence of Pragmatics from Referential Game between Theory of Mind Agents

Luyao Yuan , Zipeng Fu , Jingyue Shen , Lu Xu , Junhong Shen , Song-Chun Zhu

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

Pragmatics studies how context can contribute to language meanings [1]. In

human communication, language is never interpreted out of context, and

sentences can usually convey more information than their literal meanings [2].

However, this mechanism is missing in most multi-agent systems [3, 4, 5, 6],

restricting the communication efficiency and the capability of human-agent

interaction. In this paper, we propose an algorithm with which agents can

spontaneously learn to “read between the lines” without any explicit

hand-designed rules. We integrate the theory of mind (ToM) [7, 8] in a

cooperative multi-agent pedagogical situation and propose an adaptive

reinforcement learning (RL) algorithm to develop a communication protocol. ToM

is a profound cognitive science concept, claiming that people regularly reason

about others’ mental states, including beliefs, goals, and intentions, to

obtain performance advantage in competition, cooperation or coalition. With

this ability, agents consider language as not only messages but also rational

acts reflecting others’ hidden states. Our experiments demonstrate the

advantage of pragmatic protocols over non-pragmatic protocols. We also show that the

teaching complexity under the pragmatic protocol empirically approximates

the recursive teaching dimension (RTD).

EMOPAIN Challenge 2020: Multimodal Pain Evaluation from Facial and Bodily Expressions

Nadia Berthouze , Michel Valstar , Amanda Williams , Joy Egede , Temitayo Olugbade , Chongyang Wang , Hongyin Meng , Min Aung , Nicholas Lane , Siyang Song

Comments: 8 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

The EmoPain 2020 Challenge is the first international competition aimed at

creating a uniform platform for the comparison of machine learning and

multimedia processing methods for automatic chronic pain assessment from human

expressive behaviour, and also the identification of pain-related behaviours.

The objective of the challenge is to promote research in the development of

assistive technologies that help improve the quality of life for people with

chronic pain via real-time monitoring and feedback to help manage their

condition and remain physically active. The challenge also aims to encourage

the use of the relatively underutilised, albeit vital, bodily expression signals

for automatic pain and pain-related emotion recognition. This paper presents a

description of the challenge, competition guidelines, benchmarking dataset,

and the baseline systems’ architecture and performance on the three sub-tasks:

pain estimation from facial expressions, pain recognition from multimodal

movement, and protective movement behaviour detection.

S$^{2}$OMGAN: Shortcut from Remote Sensing Images to Online Maps

X. Chen (1), S. Chen (1), T. Xu (1), B. Yin (1), X. Mei (2), J. Peng (2), H. Li (2) ((1) School of Computer Science, Wuhan University, Wuhan, 430072, China, (2) School of Geosciences and Info-Physics, Central South University, Changsha, 410083, China)

Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Machine Learning (stat.ML)

Traditional online maps widely used on the Internet, such as Google Maps and Baidu

Maps, are rendered from vector data. Updating online maps promptly from vector

data, which is time-consuming to generate, is a difficult task. Generating

online maps directly from remote sensing images, which can be acquired

quickly and without vector data, offers a shortcut. However, this task used to be

challenging or even impossible. Inspired by image-to-image translation

(img2img) techniques based on generative adversarial network (GAN), we propose

a semi-supervised structure-augmented online map GAN (S$^{2}$OMGAN) model to

generate online maps directly from remote sensing images. In this model, we

designed a semi-supervised learning strategy to pre-train S$^{2}$OMGAN on rich

unpaired samples and fine-tune it on limited paired samples in reality. We also

designed image gradient L1 loss and image gradient structure loss to generate

an online map with global topological relationship and detailed edge curves of

objects, which are important in cartography. Moreover, we propose edge

structural similarity index (ESSI) as a metric to evaluate the quality of

topological consistency between generated online maps and ground truths.

Experimental results show that S$^{2}$OMGAN outperforms state-of-the-art

(SOTA) works according to mean squared error, structural similarity index and

ESSI. S$^{2}$OMGAN also wins more approval than SOTA works in the human perceptual

test on the visual realism of cartography. Our work shows that S$^{2}$OMGAN is

potentially a new paradigm for producing online maps. Our implementation of

S$^{2}$OMGAN is available at this https URL.

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

Xiaolong Ma , Wei Niu , Tianyun Zhang , Sijia Liu , Fu-Ming Guo , Sheng Lin , Hongjia Li , Xiang Chen , Jian Tang , Kaisheng Ma , Bin Ren , Yanzhi Wang

Comments: arXiv admin note: text overlap with arXiv:1909.05073

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Weight pruning has been widely acknowledged as a straightforward and

effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby

achieving acceleration on various platforms. However, most of the pruning

techniques are essentially trade-offs between model accuracy and regularity

which lead to impaired inference accuracy and limited on-device acceleration

performance. To solve the problem, we introduce a new sparsity dimension,

namely pattern-based sparsity, which comprises pattern and connectivity sparsity

and is both highly accurate and hardware friendly. With carefully

designed patterns, the proposed pruning unprecedentedly and consistently

achieves accuracy enhancement and better feature extraction ability on

different DNN structures and datasets, and our pattern-aware pruning framework

also achieves pattern library extraction, pattern selection, pattern and

connectivity pruning and weight training simultaneously. Our approach to the

new pattern-based sparsity naturally fits into compiler optimization for highly

efficient DNN execution on mobile platforms. To the best of our knowledge, this

is the first time that mobile devices achieve real-time inference for

large-scale DNN models, thanks to the unique spatial property of pattern-based

sparsity and the help of the code generation capability of compilers.
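
The abstract does not spell out how a pattern is chosen for a given kernel. As a minimal sketch, assuming 3x3 kernels and a small hypothetical library of binary masks that each keep four weights, one simple heuristic (not necessarily the authors' method) picks the mask that retains the most squared weight magnitude. The function names and the two example masks below are illustrative assumptions, not part of the paper.

```python
def retained_energy(kernel, pattern):
    """Sum of squared weights that the binary pattern keeps."""
    return sum(
        w * w
        for kernel_row, pattern_row in zip(kernel, pattern)
        for w, keep in zip(kernel_row, pattern_row)
        if keep
    )


def select_pattern(kernel, library):
    """Pick the library pattern that preserves the most weight energy."""
    return max(library, key=lambda pattern: retained_energy(kernel, pattern))


def apply_pattern(kernel, pattern):
    """Zero out every weight whose position is not kept by the pattern."""
    return [
        [w if keep else 0.0 for w, keep in zip(kernel_row, pattern_row)]
        for kernel_row, pattern_row in zip(kernel, pattern)
    ]


# Hypothetical two-entry pattern library (each mask keeps 4 of 9 weights).
PATTERN_TOP_LEFT = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
PATTERN_BOTTOM_RIGHT = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]

# Toy kernel whose large weights sit in the top-left corner.
KERNEL = [[1.0, 2.0, 0.1], [3.0, 4.0, 0.1], [0.1, 0.1, 0.1]]

chosen = select_pattern(KERNEL, [PATTERN_TOP_LEFT, PATTERN_BOTTOM_RIGHT])
pruned = apply_pattern(KERNEL, chosen)
```

Because every kernel ends up with one of a few fixed shapes, a compiler can in principle generate specialised code per pattern, which is the hardware-friendliness the abstract refers to. Connectivity sparsity (removing whole kernels) is not shown here.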

Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation

Christoph Klemenjak , Stephen Makonin , Wilfried Elmenreich

Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Non-Intrusive Load Monitoring (NILM) comprises a set of techniques that

provide insights into the energy consumption of households and industrial

facilities. Latest contributions show significant improvements in terms of

accuracy and generalisation abilities. Despite all progress made concerning

disaggregation techniques, performance evaluation and comparability remain an

open research question. The lack of standardisation and consensus on evaluation

procedures makes reproducibility and comparability extremely difficult. In this

paper, we draw attention to comparability in NILM with a focus on highlighting

the considerable differences amongst common energy datasets used to test the

performance of algorithms. We divide the discussion on comparability into data

aspects and performance metrics, and take a close look at evaluation processes.

Detailed information on pre-processing as well as data cleaning methods, the

importance of unified performance reporting, and the need for complexity

measures in load disaggregation are found to be the most urgent issues in

NILM-related research. In addition, our evaluation suggests that datasets

should be chosen carefully. We conclude by formulating suggestions for future

work to enhance comparability.

Information Theory

On the Capacity of Waveform Channels Under Square-Law Detection of Time-Limited Signals

Amir Tasbihi , Frank R. Kschischang

Comments: Submitted to IEEE Trans. Inf. Theory, January 8, 2020

Subjects: Information Theory (cs.IT)

Capacity bounds for waveform channels under square-law detection of

time-limited complex-valued signals are derived. The upper bound is the

capacity of the channel under (complex-valued) coherent detection. The lower

bound is one bit less, per dimension, than the upper bound.

Optimal Multistage Group Testing Algorithm for 3 Defectives

Ilya Vorobyev

Comments: 8 pages

Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

Group testing is a well-known search problem that consists in detecting

$s$ defective members of a set of $t$ samples by carrying out tests on properly

chosen subsets of samples. In classical group testing the goal is to find all

defective elements by using the minimal possible number of tests in the worst

case. In this work, a multistage group testing problem is considered. Our goal

is to construct a multistage search procedure, having asymptotically the same

number of tests as an adaptive one. We propose a new approach to designing

multistage algorithms, which allows us to construct a 5-stage algorithm for

finding 3 defectives with the optimal number $3\log_2 t\,(1+o(1))$ of tests.
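
To see why roughly $3\log_2 t$ tests is the right scale, a standard counting argument helps: each test has a binary outcome, so any procedure that identifies the defective set must use at least $\log_2 \binom{t}{s}$ tests. The sketch below compares that bound with the leading term for $s = 3$. The function names are illustrative, and this is the generic information-theoretic bound, not the paper's 5-stage algorithm.

```python
import math


def counting_lower_bound(t, s):
    """Fewest binary tests that can distinguish all C(t, s) defective sets."""
    return math.ceil(math.log2(math.comb(t, s)))


def leading_term(t, s=3):
    """Leading term s * log2(t) of the test count quoted in the abstract."""
    return s * math.log2(t)


# For t = 1024 samples and s = 3 defectives the two quantities are close,
# and their ratio approaches 1 as t grows, since
# log2 C(t, 3) = 3 log2 t - log2 6 + o(1).
bound_1024 = counting_lower_bound(1024, 3)
lead_1024 = leading_term(1024)
```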

Construction of Rate (n-1)/n Non-Binary LDPC Convolutional Codes via Difference Triangle Sets

Gianira N. Alfarano , Julia Lieb , Joachim Rosenthal

Comments: The paper was submitted to ISIT 2020

Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

This paper provides a construction of non-binary LDPC convolutional codes,

which generalizes the work of Robinson and Bernstein. The sets of integers

forming an $(n-1,w)$-difference triangle set are used as supports of the

columns of rate $(n-1)/n$ convolutional codes. If the field size is large

enough, the Tanner graph associated with the sliding parity-check matrix of the

code is free from $4$- and $6$-cycles not satisfying the full rank condition.

This is important for improving the performance of a code and avoiding the

presence of low-weight codewords and absorbing sets. The parameters of the

convolutional code are shown to be determined by the parameters of the

underlying difference triangle set. In particular, the free distance of the

code is related to $w$ and the degree of the code is linked to the “scope” of

the difference triangle set. Hence, the problem of finding families of

difference triangle sets with minimum scope is equivalent to finding convolutional

codes with small degree.
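
For readers unfamiliar with the combinatorial object, a difference triangle set is usually defined as a collection of integer sets in which all positive differences taken within each set are distinct across the whole collection, and its scope is the largest such difference. The sketch below checks that property and computes the scope. It assumes this common definition; the abstract does not state the exact $(n-1,w)$ parameter convention, so the checker deliberately does not depend on it, and the function names are illustrative.

```python
from itertools import combinations


def positive_differences(block):
    """All differences b - a with a < b inside one set of integers."""
    ordered = sorted(block)
    return [b - a for a, b in combinations(ordered, 2)]


def is_difference_triangle_set(blocks):
    """True if no positive within-set difference occurs twice overall."""
    seen = set()
    for block in blocks:
        for diff in positive_differences(block):
            if diff in seen:
                return False
            seen.add(diff)
    return True


def scope(blocks):
    """Largest difference in any set, i.e. the largest span max - min."""
    return max(max(block) - min(block) for block in blocks)


# Two sets of three integers: differences {1, 3, 4} and {2, 5, 7} are distinct.
EXAMPLE = [[0, 1, 4], [0, 2, 7]]
example_ok = is_difference_triangle_set(EXAMPLE)
example_scope = scope(EXAMPLE)
```

A collection such as `[[0, 1, 2]]` fails the check because the difference 1 appears twice.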

On the Performance of Quickest Detection Spectrum Sensing: The Case of Cumulative Sum

Ahmed Badawy , Ahmed El Shafie , Tamer Khattab

Comments: This paper is accepted for publication in IEEE Communications Letters, Jan 2020

Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

Quickest change detection (QCD) is a fundamental problem in many

applications. Given a sequence of measurements that exhibits two different

distributions around a certain flipping point, the goal is to detect the change

in distribution around the flipping point as quickly as possible. The QCD

problem appears in many practical applications, e.g., quality control, power

system line outage detection, spectrum reuse, and resource allocation and

scheduling. In this paper, we focus on spectrum sensing as our application

since it is critical to the proper functioning of cognitive radio

networks. Relying on the cumulative sum (CUSUM), we derive the probability of

detection and the probability of false alarm of CUSUM-based spectrum sensing.

We show the correctness of our derivations using numerical simulations.
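
The CUSUM test itself can be stated in a few lines. The following is a minimal sketch of the textbook recursion, assuming a known Gaussian mean shift between the pre-change and post-change distributions; it does not reproduce the paper's derivation of detection and false-alarm probabilities, and the function names, the mean-shift model, and the threshold are illustrative assumptions.

```python
def gaussian_mean_shift_llr(mu0, mu1, sigma):
    """Build log f1(x)/f0(x) for N(mu1, sigma^2) against N(mu0, sigma^2)."""
    def llr(x):
        return (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma ** 2
    return llr


def cusum_detect(samples, log_likelihood_ratio, threshold):
    """Return the first index where the CUSUM statistic exceeds threshold.

    Returns None if the threshold is never crossed.
    """
    statistic = 0.0
    for index, sample in enumerate(samples):
        # Accumulate evidence for the post-change distribution and clamp
        # at zero so that pre-change samples cannot build up a deficit.
        statistic = max(0.0, statistic + log_likelihood_ratio(sample))
        if statistic > threshold:
            return index
    return None


# Noise-free toy sequence: level 0 for ten samples, then level 1.
llr = gaussian_mean_shift_llr(mu0=0.0, mu1=1.0, sigma=1.0)
alarm_index = cusum_detect([0.0] * 10 + [1.0] * 10, llr, threshold=2.0)
```

Raising the threshold lowers the false-alarm rate at the cost of a longer detection delay, which is the trade-off that the two probabilities in the abstract quantify.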

Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification

Jiawei Zhang

Comments: 7 pages, 6 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Existing graph neural networks may suffer from the “suspended animation

problem” when the model architecture goes deep. Meanwhile, for some graph

learning scenarios, e.g., nodes with text/image attributes or graphs with

long-distance node correlations, deep graph neural networks will be necessary

for effective graph representation learning. In this paper, we propose a new

graph neural network, namely DIFNET (Graph Diffusive Neural Network), for graph

representation learning and node classification. DIFNET utilizes both neural

gates and graph residual learning for node hidden state modeling, and includes

an attention mechanism for node neighborhood information diffusion. Extensive

experiments are conducted in this paper to compare DIFNET against several

state-of-the-art graph neural network models. The experimental results

illustrate both the learning performance advantages and effectiveness of

DIFNET, especially in addressing the “suspended animation problem”.

Coarse-Grain Cluster Analysis of Tensors With Application to Climate Biome Identification

Derek DeSantis , Phillip J. Wolfram , Katrina Bennett , Boian Alexandrov

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (stat.ML)

A tensor provides a concise way to codify the interdependence of complex

data. Treating a tensor as a d-way array, each entry records the interaction

between the different indices. Clustering provides a way to parse the

complexity of the data into more readily understandable information. Clustering

methods are heavily dependent on the algorithm of choice, as well as the chosen

hyperparameters of the algorithm. However, their sensitivity to data scales is

largely unknown.

In this work, we apply the discrete wavelet transform to analyze the effects

of coarse-graining on clustering tensor data. We are particularly interested in

understanding how scale affects clustering of the Earth’s climate system. The

discrete wavelet transform allows classification of the Earth’s climate across

a multitude of spatial-temporal scales. The discrete wavelet transform is used

to produce an ensemble of classification estimates, as opposed to a single

classification. Using information theory, we discover a sub-collection of the

ensemble that spans the majority of the variance observed, allowing for

efficient consensus clustering techniques that can be used to identify climate

biomes.
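
The coarse-graining step can be illustrated with the simplest wavelet. The sketch below applies the Haar approximation step repeatedly to a one-dimensional signal, halving its resolution at each level; each coarse-grained version could then be passed to a clustering algorithm to build an ensemble of classifications. It assumes a 1-D signal of even length and the Haar wavelet, neither of which is stated in the abstract, and it omits the clustering and information-theoretic selection steps. The function names are illustrative.

```python
import math


def haar_approximation(signal):
    """One Haar DWT level: scaled pairwise sums (approximation coefficients)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [
        (signal[i] + signal[i + 1]) / math.sqrt(2.0)
        for i in range(0, len(signal), 2)
    ]


def coarse_grain(signal, levels):
    """Apply the Haar approximation step `levels` times."""
    out = list(signal)
    for _ in range(levels):
        out = haar_approximation(out)
    return out


# Each level keeps the large-scale trend and discards fine detail.
one_level = coarse_grain([1.0, 1.0, 3.0, 3.0], levels=1)
two_levels = coarse_grain([1.0, 1.0, 3.0, 3.0], levels=2)
```

For a d-way tensor the same step would be applied along the spatial and temporal axes, giving one coarse-grained tensor per combination of scales.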

Physical Layer Authentication for Non-coherent Massive SIMO-Based Industrial IoT Communications

Zhifang Gu , He Chen , Pingping Xu , Yonghui Li , Branka Vucetic

Subjects: Signal Processing (eess.SP); Cryptography and Security (cs.CR); Information Theory (cs.IT)

Achieving ultra-reliable, low-latency and secure communications is essential

for realizing the industrial Internet of Things (IIoT). Non-coherent massive

multiple-input multiple-output (MIMO) has recently been proposed as a promising

methodology to fulfill ultra-reliable and low-latency requirements. In

addition, physical layer authentication (PLA) technology is particularly

suitable for IIoT communications thanks to its low-latency attribute. A PLA

method for non-coherent massive single-input multiple-output (SIMO) IIoT

communication systems is proposed in this paper. Specifically, we first

determine the optimal embedding of the authentication information (tag) in the

message information. We then optimize the power allocation between message and

tag signal to characterize the trade-off between message and tag error

performance. Numerical results show that the proposed PLA is more accurate than

traditional methods adopting the uniform tag when the communication reliability

remains at the same level. The proposed PLA method can be effectively applied

to the non-coherent system.
