As I read the article (posted on this site by dm14) “http://www.wired.com/gadgetlab/2012/08/apple-amazon-mat-honan-hacking/”, I could not help but read it from the viewpoint of “composition of independent systems”. One of its core themes is that “in a system with many participating entities, each entity should be aware of the side effects of another entity’s work”. (That the Apple tech did a crappy job is just one part of the problem; in this blog post we raise our eyebrows at it but do not engage with it.)
Apple and Amazon are part of an ecosystem. Each gathers information about an individual. However, they do not tell each other which pieces of that information they rely on to control access to their systems. In short: no data-flow analysis is being done across the security setups of Amazon, Apple, and the rest. That is exactly the analysis the hacker did in order to exploit the system.
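To make the "data-flow analysis across security setups" idea concrete, here is a minimal sketch. It assumes each service can be described by the facts it requires to pass its identity check and the facts it reveals once someone is inside the account. The `Service` class, the `reachable_accounts` function, and the service and fact names are all hypothetical simplifications for illustration, not a description of how any real site works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    requires: frozenset  # facts needed to pass this service's identity check
    reveals: frozenset   # facts visible once inside the account


def reachable_accounts(services, initial_facts):
    """Fixed-point iteration: keep unlocking services until nothing new opens up."""
    known = set(initial_facts)
    compromised = []
    changed = True
    while changed:
        changed = False
        for svc in services:
            if svc.name not in compromised and svc.requires <= known:
                compromised.append(svc.name)
                known |= svc.reveals  # what this account leaks feeds the next check
                changed = True
    return compromised, known


# Hypothetical two-service ecosystem: what "shop" reveals is what "cloud" trusts.
shop = Service("shop", frozenset({"name", "email", "billing_address"}),
               frozenset({"card_last4"}))
cloud = Service("cloud", frozenset({"billing_address", "card_last4"}),
                frozenset({"mail_access"}))

chain, facts = reachable_accounts([cloud, shop], {"name", "email", "billing_address"})
print(chain)
```

Neither service looks unsafe when examined alone; the weakness only shows up when the `reveals` set of one is composed with the `requires` set of the other, which is the point of doing the analysis across services.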
Let us examine a few computer science fields (that I am familiar with) where the above composition problem exists.
1) Composition of optimizations in a compiler phase (not a success story):
If Apple is ‘partial redundancy elimination’ (PRE) and Amazon is ‘register allocation’ (technically it is not an optimization phase, but for practical purposes a poor register allocator does not stop the program from running, so we can treat regalloc as one), then when PRE moves an operation much earlier in the program, it lengthens the live range of the result and increases register pressure, which may lead to a poor allocation by the register allocator. These two stages simply cannot be combined cleanly; no perfect way has been invented yet. Thus Dr. Keith Cooper says, “there is nothing optimal about compiler optimizations”. An optimization improves some part of the program, but it does not always produce optimal code. (Strategies currently being explored: machine learning techniques, and the brute-force strategy of running PRE then reg_alloc or the other way around. Eventually register allocation has to be done, but you get the idea.)
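The register-pressure side effect can be seen in a toy model. The sketch below does not implement PRE or a register allocator; it only counts how many values are live at once in a straight-line instruction list, assuming every name is defined exactly once and that a value is live from its definition until its last use. The function name `max_register_pressure` and both instruction lists are made up for illustration.

```python
def max_register_pressure(instrs):
    """instrs: list of (dest, [operands]); returns the peak number of live values.

    Simplified model: each name is defined once, and liveness is measured
    just after each instruction executes.
    """
    last_use = {}
    for i, (_, operands) in enumerate(instrs):
        for v in operands:
            last_use[v] = i

    live, peak = set(), 0
    for i, (dest, operands) in enumerate(instrs):
        # Operands that are used here for the last time free their registers.
        live -= {v for v in operands if last_use[v] == i}
        # The result occupies a register only if something uses it later.
        if dest in last_use:
            live.add(dest)
        peak = max(peak, len(live))
    return peak


# 't' is computed right before its only use.
original = [
    ("a", []), ("b", []), ("c", ["a", "b"]),
    ("d", []), ("e", ["c", "d"]),
    ("t", []), ("r", ["e", "t"]),
]

# Same code after 't' has been hoisted to the top, as code motion might do.
hoisted = [
    ("t", []),
    ("a", []), ("b", []), ("c", ["a", "b"]),
    ("d", []), ("e", ["c", "d"]),
    ("r", ["e", "t"]),
]

print(max_register_pressure(original), max_register_pressure(hoisted))
```

In this model the hoisted version needs one more register at its peak than the original, even though both compute the same thing. On a machine where that extra register is not available, the allocator would have to spill.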
2) Composition of communication layers on exascale systems (a success story):
Currently, my programs on the Jaguar supercomputer at ORNL run on the following software stack:
CAF 2.0 program (my program thinks it has 2GB RAM per core)
|
CAF 2.0 runtime (CAF runtime thinks it has 2GB RAM per core)
|
GASNET communication layer (communication layer needs to pin memory for buffers; it toys with the idea of having 2GB RAM per core)
|
GEMINI (uGNI) hardware abstraction layer
|
GEMINI hardware (the actual interconnect)
Unless each layer is aware of the memory consumption of the other layers, we are sure to hit “out of memory” (the dreaded OOM killer; we encounter it a lot more often than one can imagine). However, there are some success stories here, in which each layer exposes a sort of pinnable_ratio (the ratio of the memory a layer may occupy to the total amount of memory), which helps.
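A pinnable_ratio scheme can be sketched as a simple budget check. This is only an illustration of the idea, assuming each layer declares the fraction of per-core memory it may occupy; the function `plan_memory`, the layer names used as keys, and the ratio values are hypothetical and are not taken from the CAF 2.0 runtime or GASNet.

```python
def plan_memory(total_bytes, ratios):
    """Turn per-layer ratios into byte budgets, refusing oversubscribed plans."""
    if any(r < 0 for r in ratios.values()):
        raise ValueError("ratios must be non-negative")
    claimed = sum(ratios.values())
    if claimed > 1.0:
        # Every layer believing it owns all the memory is how the OOM killer shows up.
        raise ValueError(f"layers claim {claimed:.2f} of memory; at most 1.00 exists")
    return {layer: int(total_bytes * r) for layer, r in ratios.items()}


per_core = 2 * 1024**3  # 2GB RAM per core, as in the stack above

budgets = plan_memory(per_core, {
    "program": 0.5,        # hypothetical split
    "runtime": 0.25,
    "communication": 0.25,
})
print(budgets)
```

The check is trivial, but it only works because every layer publishes its ratio. A stack in which each layer silently assumes the full 2GB corresponds to ratios of 1.0 everywhere, which `plan_memory` rejects up front instead of failing at run time.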
Conclusion: A centralized data-flow analysis between sites such as Apple/Google/Amazon is needed. The authority that performs it need not be the hacker.
I agree, and I think this problem is deeply related to the core methodology of engineering. Since the major task of engineering and computer science is dealing with complexity, composition is actually a good idea, but everything has pros and cons, so I think we sometimes need to attack each problem with a different technique.
However, if we have centralized data-flow analysis, then privacy is another problem.