Witold Litwin

ACM-Fellow 2001 (1st ever in France)


Position :

Professor at University Paris 9 Dauphine

Emeritus since Sept. 1, 2014

Content that follows might not be up to date. Sorry. Life is flying.

Database Courses  (taught until the retirement)

Director of CERIA (until 2009):

CENTRE DES ETUDES ET DE RECHERCHES EN INFORMATIQUE APPLIQUEE


Member of LAMSADE (present)

Short Bio (pointing also towards full CV)

ACM-Fellow 2001 Award Ceremony (Amateur Video) Play

(DSL or Local Net & Windows Media Player and Real One Player Recommended; click on Play button without waiting for the full file download. That one may take some minutes otherwise) 

ACM Special Event to Honor Leading European Computer Scientists. Oct 8, 2009, Paris.

Amateur video features the main part of the ceremony. Wmv or svf or, or flv, allow a minute or so before the streaming begins). Prof Dame Wendy Hall, ACM President, in green jacket, greets European Recipients of major research-oriented ACM Awards. ACM VP (sorry not to retain the name) calls us by name and the original Award Citation. Sir Tony Hoare, called, regretfully, before the video begins, for his 1980 Turing Award, starts the “pas de deux”.  The two photos, 1 and 2, are derived from the video and feature eight from nine honored European ACM Fellows.  The picture misses the last one called in for technical reasons. I possibly fix it in near future. Finally, guess who folks on the final enclosed souvenirs are, P1, P2, P3, especially the blond lady with the (Andy Warhol’s to respond to some inquires) scarf.


 

Address

Prof. Witold LITWIN
Université Paris 9 Dauphine
Pl. du Mal. de Lattre de Tassigny
75016 Paris FRANCE

Witold.litwin (add @ followed by dauphine.fr)

(o) 33.(0)1.44.05.48.80 
(f) 33.(0)1.44.05.48.80 
(h) 33.(0)1.46.21.08.10


 

 

 

 

 

 

 

 

Latest publications

Witold Litwin. Relations with Stored and Inherited Attributes. Lamsade Technical Report, Mars, 2016. pdf  talk at Dauphine on June 13, 2016 pptx

The universally applied Codd’s relational model has two main constructs: a stored relation, with stored attributes only and a view, only with inherited ones. In 1992, we have proposed a single construct, able to mix both types of attributes. Examples showed the construct attractive. No one jumped however on the idea. We now revisit our proposal. We show that the relational conceptual schemes using our construct may be more faithful to reality. They may also spare to queries the customary logical navigation through join clauses. Likewise, queries may be free of complex value expressions. Both annoyances are unavoidable at present, unless hidden behind dedicated views. These are cumbersome in turn. Our construct appears the first generally practical solution to this decades’ old dilemma. Better late than never, existing DBMSs should easily accommodate it, with almost no storage and processing overhead.

Sushil Jajodia, Witold Litwin, Thomas Schwarz. On-the fly AES Decryption/Encryption for Cloud  SQL Databases, Lamsade Technical Report, June 15, 2015 & ACM Repository. pdf

We propose the client-side AES256 encryption for a cloud SQL DB. Column ciphertext may be deterministic or probabilistic. We trust the cloud DBMS for security of its run-time variables, e.g., through a moving target defense. The client may send AES key(s) with the query. These serve the on-the-fly decryption of selected ciphertext into plaintext for query evaluation. The DBMS clears the key(s) and the plaintext at the query end at latest. It may deliver ciphertext to decryption enabled clients or plaintext otherwise, e.g., to browsers/navigators. The scheme functionally offers to a cloud DBMS capabilities of a plaintext SQL DBMS. AES processing overhead appears negligible for a modern CPU, e.g., a popular Intel I5. The deterministic encryption may have no storage overhead. The probabilistic one doubles the DB storage. The scheme seems the first generally practical for an outsourced encrypted SQL DB. An implementation sufficing to practice with appears easy. An existing cloud SQL DBMS with UDF support should do. 

Recent Research Activities

1.    Clouds and Security

       We worked on this topic since 2009.  At the Center for Security of Information Systems (CSIS) at GMU and at Dauphine. With Prof. Sushil Jajodia, Director of CSIS & with Thomas Schwarz, Prof. at SCU then. Lamsade site indexes the publications. The above presented research is the current goal. It follows our 2014-15 work proposing a new homomorphic encryption scheme for queries with SQL value expressions for a cloud DB. We believe that scheme the 1st practical one for its goal. See our article in Dexa-2015 proceedings (Springer).

     Before, we worked on cloud data privacy and (encryption) key recovery using a cloud as a new tool. The former direction led to a method for controlled access to a scalable collection of data records. E.g., documents, like SharePoint does. We believe, however, our method the fastest at its time and likely still. The key recovery issue led to two proposals for key copy management. First one partitions the copy into fragments stored on a few randomly chosen cloud nodes among many. A legitimate user may easily recover the key. The attacker should illegally gain access to almost all of the candidate nodes. The event that appears  highly unlikely in practice. 

     The other method trusts a key copy to an escrow agent, encrypted by the key owner in a specific way. The particularity is that the owner may define the recovery complexity at will. The idea is that to recover the key in some practical time, e.g., 1 minute, may require then a large cloud, say 1000 nodes. The cloud uses a scalable distributed virtual data structure, see the next section. The complexity should inversely depend on the trust into the escrow service. The goal is to make an illegal recovery, e.g., by an insider, expensive and traceable enough to usually deter attacks. A legitimate recovery may cost the same. But, can still be preferred to the disaster of the key loss. A legitimate client may also know a secret info defined by the key owner and unknown to the escrow service, called discount. A discount makes the recovery faster or the cloud smaller. We feel the method practical, e.g., as a Web service. One could pay there only when the recovery is requested or as an insurance etc.

2.    Scalable Distributed Virtual Data Structures

     Research on key recovery brought us around 2012 to the proposal of a new type of data structures. We called them scalable distributed virtual data structures (SDVSs). Like a traditional data structure, an SDVS is a collection of (key, non-key) structured records. Key values are 0,1..M-1 for some large M. But, these records are not stored. Instead, they are virtual. A search dynamically creates any virtual record and matches it against the search criteria. A positive match retains the record for further processing. A negative match drops any just-created one.   

     As the name suggests, the next property of an SDVS is to typically use a large cloud. The final distinguish property is that every SDVS search carries a user-defined time limit, say R over the search time at every of its cloud nodes. The overall worst search time is then R usually as well. To meet R, a node that gets a search over some records, first estimates its throughput. If the calculus shows the node unable to meet R, it splits the key set. The split sends a part of the set, say P, basically a half, to another node, for processing of the virtual records with keys within P. The partitioning is recursive till every node estimates itself able to respects R. Presently, the hash partitioning and the range partitioning of virtual records have been studied.

     Use of SDVSs allows the user defined worst-case time limit on the requested key recovery. Smaller R usually requires a larger cloud, evidently. The cloud can be heterogeneous, i.e. with nodes having different throughputs. One may also tune the cloud size as well as the average R. One may finally make SDVS reliable, i.e., protected against node failures. See the papers for more. 

     Beyond key recovery, SDVSs constitute a new tool for the operational research. They appear 1st to allow some traditionally hard calculations to be completed within the user-defined practical time limits. The condition is the availability of a sufficiently large cloud, to spread the calculations enough. One may cite the knapsack problems, the travelling salesman problems, the combinatorial enumerations, e.g., of all the permutations… It is an open research field at present.    

3.    Knapsack Joins

     We have proposed this concept with my dear co-author in my BDA 2010 invited talk, in Toulouse, The intention was to facilitate the knapsack problems calculations over relational DBs. A k-way knapsack Q-join over numerical columns r1rk selects every tuple (t1tk) with t1+…+tkC, where C is a user defined constant. A practical query may also carry the Order By and Top K clauses. For instance, someone may search for ten sets of five real estate properties to buy from a listing of a thousand with the cumulated price tag as close as possible but not over 1M$. We showed ways to make knapsack joins more efficient than through the naïve nested loop. That one was and possibly still is, the only technique used by DBMSs at the time of writing. One may easily see that our example is then yet only a dream.

     My talk opened the conference. A 2nd invited talk closed it. It was by a nice researcher from Yahoo, New York. Amazingly, it turned out that she was investigating similar issues. The goal of her work was indeed a hypothetical Yahoo assistant for folks searching for a collection of IPhone’s accessories fitting best a given budget. I did not hear since about anything commercial, at Yahoo or elsewhere that would follow up her proposals or ours.

4.    Scalable Distributed Data Structures (SDDSs)

SDDSs are a class of data structures we have proposed with M-A Neimat and D. Schneider at HP Labs, Palo Alto, in 1992. This concept and the subsequent outcomes were apparently one “leg” of my ACM Fellow Award. SDDSs were intended for  multicomputers. This naming was coined by A. Tanebaum, in his famous book. It designated for networks of share-nothing processors (nodes) linked through high-speed network, wide-area or local, or a bus. We published the 1st SDDS termed LH* at ACM-SIGMOD 93. Soon after, HP Labs patented it. An SDDS can span over potentially any number of nodes, scaling up or down, i.e., on more or less nodes, transparently for the application. SDDS files can reach sizes and access performance impossible for more traditional file structures. For best access performance, an SDDS file should reside for processing in the distributed RAM. Performance analysis has demonstrated experimentally the search speed of up to 300 times faster than to a local (magnetic) disk. The capabilities of SDDSs open new perspectives for many applications. Today, in 2016, many SDDSs have been proposed. Some are in everyday use by billions (see below). There are many papers on the subject with thousands of references to (see Google). SDDSs are especially investigated and now operationally built as the infrastructure for P2P systems, grids and clouds: public, private, hybrid.... CERIA prototypes focused mainly on so-called today RAM Clouds and NoSQL (cloud) databases, except for SD-SQL Server (Scalable Distributed SQL Server, see the page of S. Soror). The RAM Cloud orientation is in part due to 1992 visionary lecture at Berkeley by Jim Gray. See my MS Redmond 2001 talk below or my SDDS class (in French) where one (almost) original slide of Jim Gray on “distance to data” is present. Notice also the recent RAMCloud project headed by John Ousterhout (Stanford), although its papers, surprisingly, do not cite any of ours. For further info on SDDSs – see our classes Cloud Databases.

Research on SDDSs over last decade (see the publications of CERIA Lab and of myself):

SDDS Files under Windows
SD SQL Server

LH*RSP2P, specifically designed for P2P systems (Google Talk )

            SDDSs using Algebraic Signatures
SDDSs for Pattern Matching
AS-Index
SD-Rtrees
Secure SDDSs
LH*RE : LH* with Recoverable Encryption Keys
Privacy of Outsourcing to the Cloud for Readers Selected through Client-Side Encryption
Recoverable Encryption Through Noised Secret (2012)


Selected References

 

Introduction to SDDSs (Powerpoint), (not revised for many years, sorry about, see my class in French for the updated stuff).

 

SDDS-2000 : A Prototype System for Scalable Distributed Data Structures on Windows 2000 (Powerpoint). MS Research, Redmond  & NWU, Chicago, 2001.

 

This talk introduced SDDSs to MS Research audience. As usual, most folks in the room did not bother to read the papers published before. The talk largely predated any popular industrial development we discussed. I still wonder to what extent it influenced the future Azure Table designers. I was invited by two old pals D. Lomet & Per-Ake Larson. With D. Lomet we have proposed earlier the bounded disorder (BD) file access method. See our publication list and the Web for the follow up work. With Per-Ake we had long lasting coop over dynamic and linear hash schemes.

 

On the souvenir side, after the talk I chatted with Rick Rachid. He was in charge of MS Research then.  He noticed the SDDs potential. But, he was rather focused on the prototype MS wristwatch. It is now called generally a smartwatch. He proudly displayed a Cartier-like he had. We accessed the MFST Nasdaq quote somewhere. Well, I expressed him doubts about the bright future of the invention, given the display size. Smart phones invasion (headed then by Nokia) was about to begin and these were naturally much less limited on that side. Vice versa, his nice comments, based on his experience of building his distributed OS,  did not seem sharing my enthusiasm for the bright future of SDDSs, i.e., of the cloud infrastructure as we say today. Well, today clouds are in use by billions as we all know (see also below). MS got finally so convinced to clouds that their main fan there got promoted CEO. Actually, even the name cloud was coined later by MS. Smart watches attempt to make it as smart phone accessories. No news about MS smart watches as yet, in Feb. 2016. Researcher life has sometimes amazing turns.   

 

LH* : A Scalable Distributed Data Structure (ACM-TODS)

The paper expands earlier results, reported at Sigmod Conf. It presents especially two versions of LH*: with and without so-called coordinator. The latter uses a circulating token instead. It was never implemented beyond what our paper reports (work by J. Levy, during internship at HPL). After reading the paper, compare LH* features to those of DHTs (Distributed Hash Tables). Especially, compare the messaging cost of both types of structures. The paper proposing DHTs should say something about. But didn’t. Worse, it missed any reference to LH* work. This disregard for honest scientific practice was surprising. At least, since it was coming from Berkeley U. You might wish to look also at the LH* scan operation. It was never implemented to our best knowledge. A likely reason was the lack of good programming model. Like the popular MapReduce model.  Basic MapReduce uses nevertheless a static hash for the data distribution. It’s well-known as often inconvenient. Amazon’s Elastic Map/Reduce is closer to what LH* could provide. Again, a direction for volunteers.

LH*RS a highly-available scalable distributed data structure (ACM-TODS 2005)

Presents the prototype designed by Rim Moussa (her Ph. D.) and the theory basis. Builds up on earlier Ph.D. prototypes. One by J. Karlson (Uppsala) proposing a particularly elegant variant called  LH*LH. The other by F. Bennour, called SDDS-2002, built specifically for Windows RAM files. See CERIA publication list (or the Web). Notice that in last 30 years, only a handful of French DB research papers passed the ACM-TODS publication standards. Rim’s prototype was also demonstrated at VLDB in Toronto. Early results for LH*RS were reported also in a Sigmod paper. 

RP* : A Family of Order Preserving Scalable Distributed Data Structures (VLDB-94)

See Google’s landmark paper (OSDI06) on Bigtable. You likely used that without even noticing. As billions of others do every day, to get pages from Google. Appreciate yourself the similarity/differences between the BigTable structure and RP* schemes in our paper. In the same spirit, take a peek on Azure Table, Mongo DB, PNUTS and Hive.

Needless to hide nevertheless our opinion that all these great products root in one of RP* schemes, called RP*S in our paper. None of their documentation we’re aware of refers to it however. Unlike the honest scientific practice requires. I asked Google guys why, when I gave my LH*RSP2P Google Talk, in Google Tech Talks Series, 2007). All but one did not respond. The nice one, who, by the way, is by himself among top researchers there, said: sorry, we did not spot the paper. But we’ll cite it from now on. We’re systems guys. Do not follow up DB conferences.  Well, VLDB where the paper appeared is among main DB gatherings. DB researchers besides, follow mainstream conferences outside the field, e.g. Usenix, Mascots...  So with all due respect, we do not fully subscribe to this excuse. Especially that the no-cite practice became disgracefully widespread in the last decade. Despite great tools like Google, (with its BigTable), Scholar, DBLP, Bing… you name it. Commercial R & D world seems particularly affected by the disease. But, the university research is no more exempt either. As we already mentioned for DHTs.          

High-Availability LH* Schemes with Mirroring (COOPIS-96)

As the title suggests, we proposed a fully replicated LH* scheme. Like any such scheme, it is simple to implement. The drawback is the N*100% storage cost for N replicas, as usual as well. 

LH*S : a High-availability and High-security Scalable Distributed Data Structure (IEEE-RIDE-97)

First in the series below, exploring parity codes for high-availability SDDSs. Culminating with the LH*RS in ACM-TODS referred to above. The low storage overhead makes such schemes potentially more attractive for a RAM Cloud.

LH*g : a High-availability Scalable Distributed Data Structure (CERIA & U. Linkoping Res. Rep. & IEEE TKDE Journal, Vol. 14,  Issue: 4, 2002)

LH* Schemes with Scalable Availability (IBM-Almaden Res. Rep.)

Work patented by IBM patent. Republished in IEEE tutorial book (J. Menon ed.). See the Web.

Design Issues for Scalable Availability LH* Schemes with Record Grouping (DIMACS 99, Princeton U. 1999 & IBM-Almaden Res. Rep.)

Idem.

Design Issues for Scalable Availability LH* Schemes with Record Grouping (DIMACS 99,  , Princeton U. 1999, Inv. Talk, Ms Powerpoint)

LH*RS: A High-Availability Scalable Distributed Data Structure using Reed Solomon Codes (CERIA Res. Rep. & ACM-SIGMOD 2000, Dallas)

Scalable Distributed Data Structures for High-Performance Computing (Aurora-2000, 1st Europ.  Conf. on High. Perf. Comp., Vienna 2000, Inv. Talk, Ms Powerpoint)

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment. Conf. Intl. sur les NOuvelles TEchnologies de la REpartition (NOTERE 2008), Lyon June 2008 (Keynote).

This scheme is remarkable by its one message forwarding cost for the worst case key-based addressing in an SDDS over a P2P cloud. That is, where every node is both client and server. Like, e.g., seti@home cloud. It is the least possible such cost for these clouds. We recall that LH* in general is the only to provide the least such cost, of two messages, for any SDDS over a client-server cloud, i.e., where all nodes are servers only. To compare, the same cost for DHT is (log N) for N-node cloud.  The only “competitor” to LH*RSP2P is the scheme proposed by B. Liskov & al. That one has however in addition a specific thought more sporadic messaging cost to propagate the file scalability.

The paper reports on initial results of Ph.D. research by Y. Hanafi. See the Thesis & his journal paper for more.   

An Overview of a Scalable Distributed Database System SD-SQL Server. British Natl. Conf on Data Bases (BNCOD-06), Dublin, Juillet  2006. Keynote.

This was 1st in the series devoted to SD-SQL Server: a scalable distributed cloud DBMS. The system was the subject of the Ph.D. by S. Sahri. We have presented SD-SQL Server twice to SQL Server R & D team at MS Redmond. The talks seemed well-received (as well as at BNCOD). However, the transparent table scale up  of SD-SQL Server still does not have the industrial counterpart (Feb. 2016). Neither in SQL Server nor in MS Azure SQL Database, nor elsewhere. The popular manual approach instead, by so-called industrially sharding, e.g., in Azure, is comparatively in the Middle Age.  

5.    Multidatabase Systems

MDBSs are systems for management of data in multiple autonomous databases. Work on this concept was the other “leg” of my ACM Fellow Award. We proposed it in 1980, within SIRIUS project at INRIA. The intention was to generalize adequately the database model, i.e. the concept and architecture of a DBS (Database Management System). Our “root” was the concept of a multi(data)base defined as “DB set of DBs”. Quite popular today, e.g., see the architecture of SQL Server and more at the Web.  Our first publication on the topic was at the 2-nd ACM Annual Computer Exposition in Lafayette, LA. See our publication list for the reference. To my happiness, the talk was well received. Perhaps with the exception of the alligator that run away when I stepped upon on the university garden pathway in the dark. A major benefit was also the acquaintance with a dear friend since, Peter Scheuermann, Prof. Emeritus, NWU.  

 

First prototype MDBSs, named MRDSM and SYSIDORE were designed at INRIA and presented at 1st Intl. Symp. on Distributed Databases in Paris, March 1980.  One can still (June 09) see the short description of MRDSM as one of pioneering CS tools at the Multics site maintained at MIT.

 

A central capability of an MDBS is the multidatabase definition and manipulation language, e.g. our MSQL for the relational data. This capability lacked to DBSs. Today, research on the multidatabase systems is among most popular topics. Most of DBSs silently became MDBSs, e.g., MsAccess, SQL-Server, Sybase, Interbase, Oracle, MySQL, even DB2 (the latest among the converted ones). Postgres resisted up to now (Feb. 2016). The industrial multidatabase capabilities are still rather limited as compared to those known at the research level. See, e.g., the capabilities proposed by MSQL language. See our journal papers on it or M. Rusinkiewicz & al book. Many research issues remain open. Techniques for MDBSs apply also generally to Meta-search engines on Internet. E.g., Rob's SearchEngineZ interfacing dozens of search engines. Or the MetaCrawler that searches Google, Yahoo, Bing an others. Also - EDreams, Google Shopping, Comparateur Obsèques i.e., Funeral Cost Comparator (I get this and similar friendly spam daily since my retirement started) … 

   

There is a very large literature on MDBSs. E.g., Google just displayed the estimate of 237K entries for my “multidatabase” keyword. If interested & French understanding, start perhaps with my MDBS classes.

 

Finally, today, the utility of multidatabase systems is obvious. It wasn’t so, by far, when we started to talk about the idea. The dominant trend was that of a single (integrated) DB. Possibly very large whenever needed. This was actually the motivation for the name of the well-known VLDB conference series. Also, there was a paper by Tom Steel, I believe, conjecturing the construction of a DBMS managing worldwide data as a (single) DB in a couple of years. Besides, we had heat discussions at DB conferences. To illustrate how dominant the single DB idea was, to aim at by DBMS designers, let me recall one anecdote.  I was on a plane to Cannes with one of the DB arena friends now, to attend VLDB-81 there. He was already and still is, among top researches in our field. When I asked for opinion about the idea, his response was straight: “Why should we bother about managing multiple DBs if we can manage only one”.  That was the end of the discussion, for the flight and until today.

 

6.    Other Research

Our research in 70-80-thies on virtual and especially linear hashing is sufficiently known to not be further recalled. Perhaps, except for an anecdote. Our 1st ever paper on it was a Note in highly prestigious Comptes Rendus de French Académie des Sciènces. To my stunning surprise, the Académie changed my original wording “Hash-Coding” in the title into “Codage Découpé”. Everyone concerned in France knew the meaning of the former, as there was no official French substitute. In contrast, no one knew the latter. The Académie just coined it because of my Note, in their role of the Guardian of French language. To make the link, they left however the “hash-coding” in the abstract. I wondered why they did not use therefore the existing, although not yet used for the computers, word “hachage”. The response was: no concept of coding in that one. Anyhow, the story seems unique & I’ve never seen the codage découpé applied since.

Now, let me recall some topics I also investigated 40-30 years ago. I believe they merit more attention or at least citations they’ve got since.

Structural Numbers

The concept was proposed as algebra for formal analysis of electrical circuit to Chinese Academy of Sciences by K.T. Wang in 1934. Had some follow up there. That dried out during the Chinese tumult. The topic got rediscovered in 1950 in US by R.J. Duffin and followed up by a few, especially S. Bellert in 1962. Who became in 1969 my Magister Eng. Thesis advisor at Warsaw Polytechnic School. I continued the research at INRIA, ending up with INRIA Scientific Note 7 in 1972. These were nicely edited, highly selective publications, at most one per month. The caveat was they had to be in French.

A structural number basically represents a graph. Any graph, not only an electrical circuit. Algebraic operations operate upon, in a particularly simple way. Enumerate all the circuits, trees, merge/cut edges… The Note proposes the first to my knowledge implementation of these operations that extended to hypergraphs as well. We apply all this to generate electric circuit parameter formulae, especially for impedance. The Thesis described also my prototype. Little outdated, as stored on IBM punched cards.

The graph manipulations characterize many domains. Our results were a tip of an iceberg, oriented towards our specific application. Nevertheless, as a generic graph management tool, they seemed at least competitive to the usual stuff. Despite, about nothing happened since. I could not spot any citation. One likely reason: the Note is invisible to main search engines. I guess publishing in French did not help. Also, the keyword structural numbers seems highly popular with Civil Engineering in a different sense. These references appear naturally first.

More generally, I noticed only a handful of related papers. One was a 1972 paper from India on active electrical circuits. Then, a couple of Polish ones appeared in 80-90s. They applied, manually, the structural numbers to the mechanical engineering. So, a lot of interesting work apparently still remains.

Digital Processing of Phonocardiograms

Phonocardiogram (PCG) is the recording of heart sound (murmur). The manual signal interpretation by the physician may diagnose grave illnesses. Like for ECG. However, the former seems less popular then the latter. Probably since the manual processing is lengthily. My goal was to help it through digital filtering & automatic diagnosis. Using what was then called pattern recognition & now machine learning, usually.

My work constituted the Doctorat en Informatique in in INRIA. It was published later in renowned Medical Informatics journals (search the publication list pointed to below). I demonstrated the prototype validating both goals. The signal processing introduced to PCG processing the interactive vocoding and FFT-based digital filtering (FFT means Fast Fourier Transform, we recall). The automatic classification through clustering used Benzecri’s factorial analysis. I applied the apparatus to two dangerous heart anomalies: aortic valvular stenosis and subaortic idiopathic stenosis. The diagnosis took a split of a second, instead of hours of strain to the physician eyes. 

My English paper has 3 citations at Scholar. One of those is also at MS Academic. I thought this result regretfully showing about no interest in the field. I was dead wrong. While doing this update, found dozens of follow ups. Including, US, Japan, an entire book & a 1995 survey at NIH site, cited itself over 200 times. The effort even included some French work not far from Paris around 2012. All folks continue along the path of vocoding & FFT filtering towards some classification algorithm. Why all but three avoided then citing us? Feel free to question the authors.  In the meantime, the rules of scientific honesty we’re raised with seem again the past.

Still, it did not seem that the bulk made it into the commercial world. Perhaps the current interest in Internet of Things will help, e.g., with distant patient monitoring. On anecdotic side, in early 80ties I got a letter from HP. Kindly informing me about a brand new interactive FFT-based signal filtering & analysis hardware & software box. The letter was allegedly informing me given my work in the domain. This was no more for over ten years. I understood HP somehow acknowledged the parenthood.  Anyhow, perhaps HP still sells the babies. Clustering and more data analysis would be a useful addition if not done already.   

Value Dates

A value date is the concept loved by banks, in routine use. It is a time when a transaction must commits, neither earlier nor later. For instance, your check deposit through an ATM becomes usually valid precisely at 6 pm same business day. That is the time when the credit shows up at your account and you can start spending it.

Our papers on this topic adapted the concept to the DB transaction management. It looked particularly attractive for large distributed ones. Including those reaching planetary scale, e.g., in Google’s Spanner. As for banks, the value date was giving a predictable time for every sub-transaction to end up and even to commit synchronously although in isolation. The mechanism we still believe better than the popular eventual consistency, without any such limit.

This work got observed by DB research and had a follow up. Not enough for the concept to enter the DB practice, to our knowledge. We believe, it still might and should. One reason was perhaps confusion with so called real-time transactions, proposed by our old pal Hector Garcia Molina & al. That transaction also had a time limit, but was also obviously wished to commit at earliest. We had the pleasure to match both ideas as early as in 1988, when Hector invited me for the talk at Princeton. It seems a number of folks working later on transactions, missed this important difference.   

For anecdote, I had a chance to have outstanding co-authors. My gentle initial, H. Tirri, became later the boss of Nokia Research, Palo Alto. Another friend, Y. Breitbart, also ACM Fellow, regretfully already passed away. Finally, Avi Silberschatz, Prof. at Yale, when happens to be back from Israel, does not need any introduction.  

 

Computer Life

This work had for goal future organization of Internet. The proposal was to model it upon our own. The result would be an Internet of Beings, in modern wording. Beings should have their own life, could ultimately be aware of their existence and perhaps even imagine ours. We could appear them as God(s). Amazingly, the construction was recursive. Ourselves, we could be Beings in computers of our God(s) and so on. 

One of the papers became an invited talk at Usenix. Another - became keynote for InfoJapan 90 (Information Technology Harmonizing With Society). The latter conference had another keynote. That is how I made acquaintance with Steve Jobs. His exciting talk was about NEXT. Soon after, NEXT was however a thing of the past.  

I met an unexpected number of colleagues who liked it. The idea had thus a nice follow up. In turn, I liked the cooperation with the coauthors. Nick Roussopoulos became since an ACM Fellow. Gio Wiederhold retired from Stanford, but only officially. Not surprisingly, the practical side of this research is yet to come. The astonishing part was that I could not find citations of the InfoJapan paper. Only the initial UMIACS Research Report got up to date nine citations on Scholar. It is unfortunately the only visible at the Web. I manage to locate the InfoJapan Proceedings only off-line in some US University libraries and the Congress one. AmazonStudent offers the book for incredible price of $500 (Feb. 2016). All text in the book is under the copyrights. I guess all this explains that. I found recently my text for InfoJapan prepared before the published one, after the reviewer comments. Here is at least that one. 

Amazingly, the 1st level of our model recursivity, i.e., that we are perhaps ourselves in some computer(s), appeared serious to astronomers. See there and elsewhere on the Web. Understandably, the proposers do not seem aware of our work. All things considered, again I believe that a lot of interesting work still lies ahead.

Relations with Inherited Attributes

Codd’s relational model has two types of relations. A base relation with the stored attributes only. A view has only the inherited ones, from some base relations or views. This dichotomy may appear awkward. A single concept of a relation with stored and/or inherited attributes suffices. The popular wisdom says indeed: who/what can more, can less.

We made this proposal and analyzed some consequences for an RDB design and use in this 1992 HPL Report. We planned it for a conference. This never happened. I taught the idea, in French, for a couple of years. Besides, for 24 years now, no one picked up it for a follow up, to my best knowledge. We believe our proposal still worth it. So jumped on finally myself. As you could realize, reading the top of my page.

Publication List  (see also CERIA-publications where there are some pdf’s)