## 18ème séminaire du groupe POC

**Le VENDREDI 14 DECEMBRE 2018**

Au laboratoire LAMSADE de l’Université Dauphine

Place du Maréchal de Lattre de Tassigny

75017 PARIS Cedex

Access by car and public transportation

# "Machine Learning, Networks

and Combinatorial Optimization"

The entrance is free but please **Register **using the register tool "S'inscrire" at the top of this page

Click here to see the list of registered participants

**Warning: there are two different rooms for the talks**

**9h00-9h30 Welcome breakfast Espace 7 (close to Room A709 on the same 7th floor)**

**9h30-10h15 Room A709 (7th floor) **

## Robust clustering in social networks

### During the last decades the importance of considering data uncertainty in optimization problems has become increasingly apparent, since small fluctuations of input data may lead to comparably bad decisions in many practical problems when uncertainty is ignored. If the probability distribution of the uncertain data is not known (or cannot be estimated with sufficient accuracy), a common technique is to estimate bounds on the uncertain data (i.e., define uncertainty sets) and to identify optimal solutions that are robust against data fluctuations within these bounds. This approach leads to the robust optimization paradigm that allows to consider uncertain objectives and constraints.

### Optimization problems where only the objective is uncertain arise, for instance, prominently in the analysis of social networks. This stems from the fact that the strength of social ties (i.e., the amount of influence individuals exert on each other) or the willingness of individuals to adopt and share information can, for example, only be roughly estimated based on observations. A fundamental problem arising in social network analysis regards the identification of communities (e.g., work groups, interest groups), which can be modeled as a Dominant Set Clustering Problem which in turn leads to a Standard Quadratic Optimization Problems (StQP). Here the link strengths enter the objective while the constraints are familiar probability constraints, so that they can be considered certain. Hence we investigate data uncertainty in the objective function of StQPs, considering different uncertainty sets, and derive implications for the complexity of robust variants of the corresponding deterministic counterparts. We can show that considering data uncertainty in a StQP results in another StQP of the same complexity if ellipsoidal, spherical or boxed uncertainty sets are assumed. Moreover we discuss implications when considering polyhedral uncertainty sets, and derive rigorous bounds for this case.

### (joint work with Michael Kahr and Markus Leitner, all University of Vienna)

**10h15 - 11h00 Room A 709 (7th floor) **

## Learning Interpretable Models : Machine Learning meets Combinatorial Optimization

### In many applications of machine learning (e.g. in bank, insurance, medical diagnosis), having interpretable models is of paramount importance. Unfortunately, most machine learning models used nowadays are complex models with thousands, or even millions of parameters. We argue that compact models, whose parameters have a small domain (e.g. bounded integers) are much more human-friendly. In this, talk, we will present a some models already used by physicians and biologists. We will show that learning these models is a hard combinatorial optimization problem. We will present algorithms with guarantees.

**11h00 - 11h15 Cofee break Espace 7 (close to Room A709 on the same 7th floor)**

**11h15 - 12h00 Room A 709 (7th floor) **

**Bistra DILKINA**, University of Southern California

## Learning-Driven Algorithms for Discrete Optimization

### This talk focuses on a novel fruitful synergy between machine learning and optimization --- in particular, how ML techniques can improve the design of algorithms for Discrete Optimization, both complete algorithms such as branch and bound as well as incomplete ones such as heuristic greedy search. Branch and Bound solvers for Mixed Integer Programs (MIP) such as CPLEX, Gurobi and SCIP are used daily across different domains and industries to find solutions with optimality guarantees for NP-hard combinatorial problems. Leveraging the plethora of rich and useful data generated during the solving process, we illustrate the potential for ML in MIP on two crucial tasks in branch and bound: branching variable selection and primal heuristic selection. Empirical results show that our novel approaches can significantly improve the performance of a solver on both heterogeneous benchmark instances as well as homogeneous families of instances. In the second part of the talk, we show how to leverage a unique combination of reinforcement learning and graph embedding to infer very effective data-driven greedy strategies for solving well-studied combinatorial optimization problems on graphs such as Minimum Vertex Cover, Max Cut and Traveling Salesman. I will conclude with some new directions combining ideas from these two strands of work to introduce representation learning with MIP branch and bound solving.

**12h00-12h45 Room A 709 (7th floor) **

**Bernard GENDRON**, University of Montreal, Canada

## A Bilevel Optimization Framework Integrating Network Design and Discrete Choice Models

### In many transportation applications, a network manager has to take decisions about infrastructures and services, while taking into account the preferences of various classes of users. The latter can often be formulated with discrete choice models that allow to capture the heterogeneity of users' preferences, including (but not limited to) their preferences relative to the decisions taken by the network manager. Inspired by recent contributions where discrete choice models are integrated (through simulation) into deterministic mixed-integer programming formulations, we propose a bilevel optimization model where the leader corresponds to a network design problem (the manager's decisions) and the followers to realizations (obtained through simulation) of the discrete choice models that capture path selection of heterogeneous users. Under mild assumptions on the discrete choice models, including their relationships with the leader's decision variables, we show how to reformulate the bilevel optimization model into mixed-integer linear programming formulations. We illustrate the concepts with a flow capture model that has several applications in transportation, including the deployment of vehicle inspection facilities on a highway network and the location of bicycle sharing stations.

### (Joint work with Léonard Ryo Morin and Emma Frejinger, Université de Montréal)

**12h45-14h15 Lunch break**

You can take your lunch in several CROUS restaurants inside Dauphine,

Otherwise there are some brasseries close to Dauphine.

**14h15-15h00** **Amphitheatre A9 (third floor)**

## Learning when to use a Decomposition

### Applying a Dantzig-Wolfe decomposition to a mixed-integer program (MIP) aims at exploiting an embedded model structure and can lead to significantly stronger reformulations of the MIP. Recently, automating the process and embedding it in standard MIP solvers have been proposed, with the detection of a decomposable model structure as key element. If the detected structure reflects the (usually unknown) actual structure of the MIP well, the solver may be much faster on the reformulated model than on the original. Otherwise, the solver may completely fail. We propose a supervised learning approach to decide whether or not a reformulation should be applied, and which decomposition to choose when several are possible. Preliminary experiments with a MIP solver equipped with this knowledge show a significant performance improvement on structured instances, with little deterioration on others.

**15h00-15h45 ****Amphitheatre A9 (third floor)**

## Optimization in Social Networks

### A large number of Online Social Networks (OSNs) are nowadays available on the web, providing platforms for fast information diffusion and exchange among their users.

The great popularity and expansion of OSNs have drastically changed the landscape of communications in the cyber space. In some cases, OSNs represent one of the major news sources as well as the most effective channels for viral marketing. Indeed, many people have integrated popular online social sites into their everyday lives, and rely on them for information sharing. While this can be exploited for marking purposes, OSNs are also particularly prone to generate and spread misinformation, which can have pernicious influences on individuals or society. This raises a number of relevant problems to be dealt with by different actors that operate on a social network. In this talk we describe some optimization problems that are relevant in the study of a social network.

Motivated from viral marketing, we introduce the Generalized Least Cost Influence Problem (GLCIP), in which one is asked to identify key players in an OSN so that a certain number of users in the network are reached. We present a model that is more general than other models from the literature, and hence more likely to be suitable for real-world applications.

We introduce a mathematical formulation based on the concept of activation functions, and valid inequalities that can be used to strengthen the formulation.

Exact and heuristic solution methods are developed and compared for the problem on a large benchmark of instances. Our computational results show that the proposed approach is a viable way to solve the GLCIP and that it outperforms the state-of-the-art approaches for the previously considered relevant special cases of the GLCIP.

**15h45-16h15 Coffee break**

**16h15-17h00 ****Amphitheatre A9 (third floor)**

## Design and Control of a Public Contract with an approach combining game theory and optimization.

### In all modern economies, it is common for governments and local authorities to regularly outsource services to the market. Depending on the nature of the object or service to be purchased, the transaction can be performed as a one-time activity or give rise to a long-term relationship with the provider. This talk is about the latter case, where a public entity, denoted as Agency, outsources a service to be provided to the citizens through a contract with a Provider. We address the informative asymmetry between the two players, Agency and Provider, from a game theoretical perspective, and propose a control strategy based on the solution of an optimization problem. We show that an improved control results in an incentive, for the Provider, to provide reliable information about its own performance. We report and discuss a case study arising in local public transportation.

**17h00 Closing**

## The organizers

**Ivana LJUBIC **and **Fabio FURINI**