Frama-C news and ideas


Tag - cybersecurity


Tuesday, March 18 2014

Nginx buffer overflow

A buffer overflow has been discovered in recent versions of the HTTP server Nginx.

Hacker News user jmnicolas pondered out loud: “I wonder if this discovery is a result of OpenBSD switching its focus from Apache to Nginx?”

It took me one minute to understand what ey meant. I was actually wondering why OpenBSD would be putting buffer overflows in Nginx, now that they are done putting buffer overflows in Apache. I mean, surely there is room for one more buffer overflow in Apache?


The analysis of what this means about this industry and/or me is left as an exercise to the reader.

Sunday, February 23 2014

An interesting SSL implementation bug: CVE-2013-5914

SSL in the news

SSL is a protocol for point-to-point confidential and authenticated communication over an insecure medium. It is the protocol behind HTTPS, among many other uses. In an Internet-connected system, the SSL implementation stands at the frontier between the system and the hostile outside world. For this reason, SSL implementation flaws are a prime target for attacks.

An ordinary bug

A rather banal SSL implementation bug was revealed over the weekend. A duplicated line in a crucial, if ordinary-looking, sequence of validation operations means that some of the validation steps are not taken:

    ...
    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail; // <-------------------------------------------------
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
    ...
fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;
}

Note that the second consecutive goto fail is executed with variable err containing the result of the last validation operation, 0, indicating success. This value is returned as-is by the function; the caller is therefore misled into believing that the validation succeeded, despite some of the validation steps not having been executed.

Because C lacks an exception mechanism, the above is an often-seen programming pattern. The programming style here can hardly be blamed: this is how the best C programs are written. Except of course for the extraneous goto.
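
For readers who do not write C every day, here is a minimal, self-contained sketch of the idiom (illustrative only, not Apple's code; step1() and step2() are hypothetical stand-ins for the validation operations):

/* A generic sketch of the goto-cleanup idiom used in the absence of
   exceptions. step1() and step2() are hypothetical stand-ins. */
#include <stdlib.h>

static int step1(char *buf) { buf[0] = 1; return 0; }
static int step2(char *buf) { buf[1] = 2; return 0; }

int do_steps(void)
{
    int err;
    char *buf = malloc(64);
    if (buf == NULL)
        return -1;

    if ((err = step1(buf)) != 0)
        goto fail;
    if ((err = step2(buf)) != 0)
        goto fail;
    /* An unconditional "goto fail;" accidentally duplicated between the
       two checks above would skip the remaining check while err is
       still 0, and the function would return "success" below. */
fail:
    free(buf);
    return err;
}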

The SSL implementation, and thus the bug in question, are found in tens of millions of iOS devices, with a few additional Mac OS X computers on top of that. The scope of the security problem caused by this bug, and the obviousness of the issue when pointed out, have together led to much commentary on Twitter and other popular discussion forums.

So reaction. Very noise

“I would never have had this problem because I know that goto is bad”, some commenters claim. I wish I was caricaturing, but I am unfortunately only paraphrasing and combining several public reactions into one.


“I would never have had this problem because I use braces”, essentially state others. Certainly, the root cause of the problem must have been that the developer who introduced the bug was confused about the meaning of if (c) stmt1; stmt2;. Just look how ey indented it!

These two often-heard remarks strongly suggest using brace-less control flow or the presence of goto as predictors of defects in C code. I am sure this will be a fruitful research direction. Someone from the software engineering academic community should look into it.


“Just adding a single line of code can bring a system to its knees”, reminds Arie van Deursen. True, true, an important lesson there. We should all insert the following line somewhere in our respective codebases from time to time and take a good look at the effect it has, in remembrance.

(


It is Ariane 5 all over again! Worse: this time, instead of just the makers of formal verification systems, everyone seems to have a scheme that would have prevented the problem they already know is there.

An interesting bug in a different SSL implementation

The problem in most of the armchair analysis that has been going on on the Internet lies in the following two questions: how many more bugs in security-critical C code does the proposed remedy (be it more braces, fewer gotos, …) find? How many time-costly, tedious, soul-crushingly boring false positives does it uncover?

Since commenters have been generalizing from a sample of one, I feel no shame in presenting my own single example, raising the total number of scrutinized bugs to two. Note that, for us to make statistically sound arguments, we will eventually need to extend the discussion to examples of correct code that we do not wish to change.

Until then, here is a funny SSL implementation bug. It is characterized as follows, in a version of PolarSSL 1.1.8 that colleagues and I have been modifying.

Screenshot 1: an alarm in our tis_recv() function?

CVE-2013-5914-1.png

PolarSSL expects the program in which it is incorporated to provide it with a function that receives data from the outside world and writes it to a memory buffer. We have made such a function, baptized it tis_recv, and we have set up PolarSSL to use it.

The function tis_recv takes three arguments. The first is a context argument in case someone needs one (our function ignores it). The second is a pointer to the buffer where the data is to be written, and the third is the maximum size the function is allowed to write there.

We have specified our function tis_recv thus:

/*@ 
  requires recv_r1: \valid(output+(0..output_len-1)) ;
  requires recv_r2: output_len > 0 ; 
*/
int tis_recv(void* p, unsigned char * output, size_t output_len);

The screenshot indicates on the bottom right that the pre-condition recv_r1, which states that the argument output points to a buffer large enough for output_len characters, is not verified. How could this be? Surely a false positive… Or someone is calling the function with the wrong arguments. It does not look like the problem is in our function.

The GUI informs us that the function tis_recv is called in one place only, and that place is inside ssl_fetch_input(). It is called through a function pointer stored inside a member of a struct accessed through a pointer. The GUI tells us that we can mentally substitute ssl->f_recv(...) with tis_recv(...).
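
For readers unfamiliar with this style of C, the indirection looks roughly like the sketch below. The struct and function names ending in _sketch are generic stand-ins, not PolarSSL's actual definitions:

#include <stddef.h>

/* Generic sketch of a callback stored in a struct member
   (stand-in names, not PolarSSL's actual definitions). */
typedef struct {
    int (*f_recv)(void *, unsigned char *, size_t);
    void *p_recv;
} ssl_context_sketch;

int tis_recv(void *p, unsigned char *output, size_t output_len);

int fetch_input_sketch(ssl_context_sketch *ssl, unsigned char *buf, size_t len)
{
    /* at run time, ssl->f_recv(...) is a call to tis_recv(...) */
    return ssl->f_recv(ssl->p_recv, buf, len);
}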

Screenshot 2: a wrong third argument

The GUI tells us that the buffer that PolarSSL passes to tis_recv() has size 16KiB-ish (not pictured), and that the variable len passed as third argument appears to take any of the values of its size_t type (pictured in the bottom panel below).

CVE-2013-5914-2.png

Screenshot 3: jumping back to the origin of the value

We inquire where the value of variable len was last set, and the GUI tells us it is at the yellow line just above the function call (pictured, in the middle panel). Well, hum, yes, we could have noticed that, but it was faster to click.

CVE-2013-5914-3.png

Screenshot 4: argument nb_want is too large

The value of len is computed from ssl_fetch_input()'s argument nb_want, and the value of nb_want appears to be too large, 65540, for the size of the buffer that the GUI tells us we have (in the screenshot below, the values computed as possible for nb_want are displayed in the bottom panel).

CVE-2013-5914-4.png

Screenshot 5: dangerous values come from caller ssl_read_record()

A new possibility offered by the Frama-C version I am using, which may not even(*) be available in the latest release, Fluorine, is to observe, in the bottom panel, which call-sites and originating call-stacks cause which values to occur in a variable. Here, the GUI shows that nb_want appears to be 65540 when called from function ssl_read_record() at line 1178 in file ssl_tls.c of PolarSSL. This means we can continue the investigation there. In contrast, the value of nb_want can only be 5 when ssl_fetch_input() is called from ssl_parse_client_key_exchange(), so there is no need to look at that function: it is definitely not part of this problem.

(*) I don't remember. It has been a long time, has it not?

CVE-2013-5914-5.png

Screenshot 6: the seemingly too large argument is ssl->in_msglen

CVE-2013-5914-6.png

The problem is that ssl->in_msglen is too large here. But where does it come from?

Screenshot 7:

ssl->in_msglen has been computed from two bytes sent by the network (bad, bad network). But the value of ssl->in_msglen is supposed to have been validated before being used. This is what the lines between the point where the value is obtained and the point where it is used are supposed to do.

CVE-2013-5914-7.png

Screenshot 8:

CVE-2013-5914-8.png

The value of ssl->in_msglen is validated when ssl->minor_ver is 0, and it is validated when ssl->minor_ver is 1. But according to the GUI, ssl->minor_ver can be any of 0, 1 or 2.

Explanation

At this point it is only a matter of confirming that the call to ssl_read_record() can indeed be reached with ssl->minor_ver set to 2. This is where one switches to existential mode, possibly crafting a malicious message, or simply building a convincing argument that values can converge here to cause bad things, and sending it to the official maintainer.

When I said that this was a modified PolarSSL 1.1.8 we were analyzing, I cheated a little. This is a 1.1.8 version in which I have re-introduced a bug that was there from 1.0.0 to 1.1.7. The principal maintainer of PolarSSL suggests fixing the bug by replacing == SSL_MINOR_VERSION_1 with >= SSL_MINOR_VERSION_1.
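
To make the shape of the flaw concrete, the validation looks roughly like this. It is a paraphrase, not the exact PolarSSL source; the constant SSL_MINOR_VERSION_0 is assumed by analogy with the SSL_MINOR_VERSION_1 constant in the fix:

/* Paraphrase of the validation structure (not the exact PolarSSL code). */
if( ssl->minor_ver == SSL_MINOR_VERSION_0 )
{
    /* bounds check on ssl->in_msglen suitable for SSL 3.0 records */
}

if( ssl->minor_ver == SSL_MINOR_VERSION_1 )   /* the fix: >= SSL_MINOR_VERSION_1 */
{
    /* bounds check on ssl->in_msglen suitable for TLS 1.0 records */
}

/* When ssl->minor_ver is 2, neither branch runs, and the unvalidated
   ssl->in_msglen later becomes the nb_want argument of ssl_fetch_input(). */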

Conclusion

We have seen a second example of a real bug that can occur in an SSL implementation. Understandably in the current context, there has been much speculation over the last 48 hours on the innocence of the bug in Apple's implementation. Might it be a voluntarily inserted backdoor? Is the NSA behind this bug? (I link here to John Gruber's post because he writes clearly, but the same question is raised all over the Internet).

Allow me to put it this way: if the Apple SSL bug is a trick from the NSA, then you US citizens are lucky. Our spy agency in Europe is so much better that it does not even have a name you have heard before, and it is able to plant bugs where the buffer overflow leading to arbitrary code execution is three function calls away from the actual bug. The bug from our spy agency is so deniable that the code actually used to be fine when there were only two minor revisions of the SSL protocol. The backdoors from your spy agency are so lame that the Internet has opinions about them.


Real bugs come in all sizes and shapes. The fact that one mistake with security implications also causes easily detectable unreachable code or wrongly indented lines does not mean that all security flaws will be detected so easily, nor that plenty of quirky but functionally correct code will not be wrongly flagged.

Speaking of varied bugs, “what about PolarSSL version 1.1.8 without the manually re-inserted bug from CVE-2013-5914?”, I hear you ask. This will be the subject of another blog post.

Acknowledgments

Philippe Herrmann was first to use Frama-C to find a security bug in PolarSSL (Philippe's bug was a denial of service in versions up to 1.1.1, fixed in 1.1.2). Anne Pacalet and Paul Bakker have helped me with various aspects of PolarSSL's verification, including but not limited to the bug described in this post. Twitter users aloria, thegrugq, and johnregehr provided comments on drafts of this post. And finally, the Internet made this post possible by being itself.

Friday, January 24 2014

Bear-joke security is dead

Likely, you have heard this one before:

Two campers are surprised by an angry bear. One of them starts putting on eir running shoes. Surprised, the other exclaims “What are you doing, Alex? You can't outrun a bear!”

To which Alex replies: “I don't have to outrun the bear. I only have to outrun you.”

This joke used to be popular with security experts, who employed it as a metaphor. I only ever heard it in that context, the first time as a student in the late nineties. The bear joke was still used for this purpose in 2008.


You may also have heard that there is a new bear in town, and the new bear can take both Alex and eir friend, with or without running shoes. Also, the new bear could literally(*) eat a horse. And the bear is part of an organized network of hungry bears with walkie-talkies and sniper rifles.

If your approach to security was based on bear jokes with dubious morals, now must really suck. Andy Green's blog post “Cryptography may not be dead, but it is on life support” is representative of the change. One of the quotes he takes from Schneier's talk is:

Most of how the NSA deals with cryptography is by getting around it … They exploit bad implementations—we have lots of those.

Yes, we do, but we don't have to use them.


Here is to 2014 being the year of reliable cryptography implementations that cannot be circumvented through defects.


(*) “literally” within the confines of this metaphor

Monday, October 21 2013

Bruce Dawson on compiler bugs

Bruce Dawson has written a superb blog post on a Visual C++ compiler bug (now fixed), covering every aspect an essay on compiler bugs should cover.

I really like one section, that I am going to quote in full:

Security

In these paranoid days of the NSA subverting every computer system available every bug is a possible security flaw. This bug seems curiously commonplace – why wasn’t it discovered when building Windows or other 64-bit products? The most likely explanation is chance and happenstance, but compiler bugs can be used to make innocuous source code do surprising things. Maybe this bug is used as part of a back door to let code execution go in a direction that the source code says is impossible. This happened accidentally on the Xbox 360 where the low 32 bits of r0 were checked and then all 64 bits were used. This inconsistency allowed a hypervisor privilege escalation that led to arbitrary code execution.

This bug is probably not security related, but security is another reason to ensure that compiler bugs get fixed.


Bruce's post also illustrates the “most compiler bugs aren't” problem and provides a step-by-step tutorial on manual bug reduction. Regarding the latter, my reaction to Bruce's post was to bring up automatic testcase reduction with “delta debugging”, but Gravatar user jrrr beat me to it.

Frama-C contributes to making automatic testcase reduction work in practice for C compiler bugs by providing a reference C implementation that detects the undefined behaviors that could make the (original or reduced) bug reports invalid. We also use delta debugging internally to simplify our own reports of bugs in Frama-C. The next post in this blog will illustrate this use of delta debugging on a concrete example.
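
As an illustration of the kind of invalid report that a reference implementation catches, consider the following made-up "reduced" program. It can look like a miscompilation, because the printed value may change with the optimization level, but the program reads an uninitialized variable, which is undefined behavior; Frama-C's value analysis flags the uninitialized read, and the report is rejected before it wastes a compiler developer's time:

/* A made-up reduced testcase, for illustration: not a compiler bug,
   but undefined behavior that the reducer must be prevented from
   introducing or preserving. */
#include <stdio.h>

int main(void)
{
    int x;            /* never initialized */
    int y = 42;
    if (x)            /* undefined: reads an uninitialized variable */
        y = 0;
    printf("%d\n", y);
    return 0;
}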

Monday, September 2 2013

The case for formal verification of existing software

Perry E. Metzger takes a look at formal verification. This is good stuff; there is a lot to agree with here.

However, agreeing with Perry's post alone would not make a very interesting counterpoint. If agreeing was the only thing I intended to do, I might not even have written this post. Instead I intended to add, and I apologize in advance for the predictability of my views, that while creating formally verified software from scratch is useful, verifying existing software is useful too.

Yes, formally verified software written from scratch can now be large enough in scope to be a compiler or a microkernel, but when verifying existing software, we can tackle the problem from the other end: we can pick any useful software component, and verify that. We can pick a software component so useful that it is already used by millions. If we succeed in verifying it, we have put formally verified software in the hands of millions of satisfied users. Transparently.


Take the example of the SSL implementation I am taking a couple of weeks to finish massaging through Frama-C. It is not as wide in scope as Quark, seL4 or CompCert. Neither am I aiming for the same kind of functional correctness as these projects are: I am only verifying the absence of undefined behaviors in the component, and verifying the functional dependencies of the cryptographic primitives(*).

But PolarSSL is useful. Plus, by its very nature, it is exposed to security attacks (SSL is the protocol behind HTTPS). And the former three examples are full-blown research projects, in contrast to my single person-month effort.

The bonus is that the formally verified PolarSSL can be compiled and embedded with all the same compilers and into all the same firmwares as the earlier non-verified version. It is the same as the non-verified version except maybe for a couple of bugfixes, and the confidence that for an identified usage pattern, no more undefined behavior bugs will ever need to be fixed.

All in all, the formal verification of existing code, despite its differences from its “from scratch” counterpart, has become a force to be reckoned with too.


(*) translation: I use Frama-C option -deps and I compare the result to what I expected
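
To give an idea of what a functional dependency is, here is the kind of clause involved, written in ACSL for a hypothetical xor_block() primitive. This is a sketch for illustration, not an actual PolarSSL contract:

/* Hypothetical sketch: the kind of dependency information -deps
   computes, expressed as an ACSL clause. The output bytes depend only
   on the two input buffers and the length, not on any global state. */
/*@ assigns dst[0 .. n-1] \from src1[0 .. n-1], src2[0 .. n-1], n; */
void xor_block(unsigned char *dst,
               const unsigned char *src1,
               const unsigned char *src2,
               unsigned int n);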


Acknowledgement: I got the link to Perry E. Metzger's post through Toby Murray.

Wednesday, June 19 2013

Microsoft's bug bounty program

I like Robert Graham's analysis on Microsoft's new bug bounty program.


I would never have thought of selling vulnerabilities to the NSA (but then, I am not American and not a security researcher). Does the NSA not employ qualified people to look for vulnerabilities as their day job? Is that not like trying to sell a loaf of bread to a company whose business is to make bread?


Sometimes, you have a really good loaf of bread, but still… Regardless of whether the NSA already owns your particular loaf of bread, and independently of the payment-by-carrot-or-stick discussion, you are a competitor, not a provider.

Monday, May 20 2013

Attack by Compiler

The title of this post, “Attack by Compiler”, has been at the back of my mind for several weeks. It started with a comment by jduck on a post earlier this year. The post's topic, the practical undefinedness of reading from uninitialized memory, and jduck's comment, awakened memories from a 2008 incident with the random number generator in OpenSSL.

As I am writing this, if I google “attack by compiler”, the first page of results includes the classic essay Reflections on Trusting Trust by Ken Thompson, Wikipedia's definition of a backdoor in computing, an article by David A. Wheeler for countering the attack described by Thompson, a commentary by Bruce Schneier on Wheeler's article, and a Linux Journal article by David Maynor on the practicality of the attack described by Thompson on the widespread GNU/Linux platform.


This post is about a slightly different idea.

Initial conditions: trustworthy compiler, peer-reviewed changes

Suppose that we start with a trustworthy compiler, widely distributed in both source and binary form. Some people are in the position to make changes to the compiler's source code and could, like Thompson in his essay, attempt to insert a backdoor in the compiler itself.

But now, each change is scrutinized by innumerable witnesses from the Open-Source community. I say “witnesses”; in fact, they are mostly students, but we will probably be forced to assume that they can't all be mischievous.

The attackers could try to obfuscate the backdoor as they insert it, but the problem is that some of the witnesses are bound to possess this character trait that the less they understand a change, the more they investigate it. Furthermore, once these witnesses have uncovered the truth, loss of credibility will ensue for the person who tried to sneak the backdoor in. This person will lose eir commit privilege to the compiler's sources, and people will recover untainted compilers in source and binary form from their archives. This kind of approach is risky and may only result in a temporary advantage—which may still be enough.

The underhanded approach

The 2013 edition of the Underhanded C Contest is under way. The contest defines underhanded code as:

code that is as readable, clear, innocent and straightforward as possible, and yet [fails] to perform at its apparent function

Underhandedness is exactly what an attacker with commit access to the source code of the widely used compiler should aim for. If the commit is underhanded enough, the committer may not only enjoy full deniability, but ey may even get the incriminating change to remain in later versions of the compiler, as a “good” change. This implies that all affected security-sensitive applications, like the “login” program in Thompson's essay, must be updated to work around the now official backdoor in the compiler. In this scenario, even after the attack has been discovered, anytime someone unknowingly compiles an old version of “login” with a recent compiler, it's another win for the attacker.


Fortunately, we agree with Scott Craver that the C programming language is a very good context to be underhanded in, and a C compiler is even better. How about the following ideas?

  1. making pseudo-random number generators that rely on uninitialized memory less random, in the hope that this will result in weaker cryptographic keys for those who do not know about the flaw;
  2. optimizing a NULL test out of kernel code when it is one of several defenses that need to be bypassed;
  3. optimizing overflow tests such as buffer + len >= buffer_end || buffer + len < buffer out of application code, so that buffer overflows do take place in code that is guarded thus (see the sketch just after this list);
  4. optimizing source code that was written to take constant-time into binary code that reveals secrets by terminating early.
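
Regarding idea (3), the sketch below shows the pattern in question. Because pointer arithmetic that overflows is undefined behavior, a compiler may assume that buffer + len < buffer is always false and delete that part of the check; at run time, with a huge len, the addition does wrap, the remaining comparison passes, and the overflow the programmer tried to guard against goes undetected:

/* Sketch of an overflow check that relies on pointer wrap-around.
   The second disjunct is the one the compiler may remove, because it
   can only be true if undefined behavior has already occurred. */
int request_fits(char *buffer, char *buffer_end, unsigned long len)
{
    if (buffer + len >= buffer_end || buffer + len < buffer)
        return 0;   /* intended rejection of oversized or wrapping len */
    return 1;
}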

I am not being very original. According to this post by Xi Wang, idea (1) is only waiting for someone to give the compiler one last well-calibrated shove. The NULL test optimization was already implemented in the compiler when it was needed for a famous Linux kernel exploit. The interesting scenario would have been if someone had found that the code in tun_chr_poll() was almost exploitable and had submitted the GCC optimization to activate the problem, but it did not happen in this order. Idea (3) really happened.
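
Schematically, the kernel pattern alluded to above looked like this (a sketch, not the actual tun_chr_poll() code):

/* Sketch of the dereference-then-check pattern (not the actual
   tun_chr_poll() code). Because sk is dereferenced first, the compiler
   may infer that sk cannot be NULL and delete the check; if page 0 can
   be mapped, the NULL dereference itself does not trap, and one layer
   of defense silently disappears. */
struct sock { unsigned int sk_flags; };

unsigned int poll_sketch(struct sock *sk)
{
    unsigned int mask = sk->sk_flags;   /* dereference before the check */
    if (!sk)
        return 0;                       /* check legally optimized away */
    return mask;
}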

Idea (4) has not been exploited that I know of, but it is only a matter of time. If I google for “constant-time memcmp”, I may find an implementation such as the following:

int memcmp_nta(const void *cs, const void *ct, size_t count)
{
  const unsigned char *su1, *su2;
  int res = 0;

  for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--)
    res |= (*su1 ^ *su2);

  return res;
}

Nothing in the C language definition forces a compiler to compile the above function into a function that does not return as soon as variable res contains (unsigned char)-1, not to mention the possibilities if the compiler first inlines the function in a site where its result is only compared to 0, and then optimizes the code for early termination. If I was trying to sneak in a compiler change that defeats the purpose of this memcmp_nta() function, I would bundle it with auto-vectorization improvements. It is a fashionable topic, and quite exciting if one does not care about non-functional properties such as execution time.
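
A common mitigation, for what it is worth, is to launder the accesses through volatile. Here is a sketch of the idea, with no guarantee that a sufficiently aggressive compiler cannot still defeat it, since the C standard says nothing about execution time:

#include <stddef.h>

/* Sketch of a mitigation: volatile-qualified reads are harder for the
   optimizer to reason about, making early termination less likely.
   This is a convention, not a guarantee. */
int memcmp_nta_volatile(const void *cs, const void *ct, size_t count)
{
    const volatile unsigned char *su1 = cs;
    const volatile unsigned char *su2 = ct;
    unsigned char res = 0;

    for (; count > 0; ++su1, ++su2, count--)
        res |= (unsigned char)(*su1 ^ *su2);

    return res;   /* 0 if and only if the buffers are equal */
}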

Conclusion

The impracticability of the described attack is counter-balanced by some unusual benefits: at the time of the attack, someone may already have audited the pseudo-random number generator or function memcmp_nta() we used as examples. The audit may have considered both source and generated assembly code, and involved actual tests, at a time when the code was “safely” compiled. But the auditor is not going to come back to the review again and again each time a new compiler comes out. Like Monty Python's Spanish Inquisition, nobody expects the compiler-generated backdoor.


Three of my four examples involve undefined behavior. More generally, my examples all involve unwarranted programmer expectations about C idioms. This is the key to plausibly deniable compiler changes that reveal latent security flaws. What other unwarranted expectations should we take advantage of for an “attack by compiler”?

Thursday, April 4 2013

Google forking WebKit

Blink as seen from the inside

As you have undoubtedly heard if you follow this sort of thing at all, as of April 3, Google is forking WebKit. Its Chrome browser will henceforth rely on its own variation of the popular rendering engine, Blink. This is big news. If I were Google, I would have pulled a Gmail and made this serious announcement two days earlier. Google security engineer Justin Schuh gives the inside scoop.


Something particularly strikes me in Justin's post. This post reacts to that aspect of Justin's, and to that aspect only.

As of 2013, cybersecurity is an unsolved problem

Let us be honest, cybersecurity, in general, in 2013, is a bit of an unsolved problem. Despite this state of affairs, or because of it, it can be difficult to get some of the concerned companies to recognize there exists a problem at all.

Fortunately, some are more open than others regarding security. Google is more open than most, and when an exceptional event, such as the forking of WebKit, justifies it, someone there may remind us of all the Chrome team does for security in a convenient list:

[…] the Chrome security team has taken a very active role in WebKit security over the last several years, and really led the pack in making Webkit more robust against exploits. We’ve fuzzed at previously unheard of scale <goo.gl/R718K>, paid out hundreds of thousands of dollars in bug bounties <goo.gl/PGr89>, performed extensive code auditing, fixed many hundreds of security bugs, and introduced a slew of hardening measures. […]


My point is only tangentially relevant to web rendering engines, so I am taking the above list out of context, and adding my own punchline: despite all these efforts, measured in millions of dollars, it is likely that there is still at least one exploitable security flaw in Chrome, and it is possible that, as you read this, someone somewhere is exploiting this flaw.


To be very clear, I am not blaming Google here. Google allocates the millions of dollars that cybersecurity warrants. It is open about how it uses this money; Google uses all the techniques that have proved their worth: random testing, bug bounties, code reviews, defensive programming. I am only saying that cybersecurity is difficult, since even doing all this does not guarantee absolute confidence. If you were to embed Chromium on some kind of connected device, even after doing all the above, you had better leave room in the design for regular software updates.

Saturday, February 16 2013

From Facebook to Silent Circle, through the “metrics” Frama-C plug-in

From Facebook to Silent Circle

Some computers at Facebook were recently compromised because of a zero-day in Java. Nothing unexpected. Last December, instead of writing a full blog post, I lazily linked to Robert Graham predicting this sort of thing for the year 2013.


Speaking of Facebook, do you know who cares for your privacy? The good people at Silent Circle. They provide end-to-end encryption to use on your mobile phone, so that the only people to hear your discussions with your loved ones or business partners are you and them, and perhaps passengers in the same compartment as you, if you are this sort of sociopath.


Speaking of Robert Graham, he has written another excellent post, on the dangers of only focusing on the obviously important, the hard-to-get-right but known-to-be-hard-to-get-right cryptographic stuff, and neglecting the mundane, such as parsing. He takes the example of source code recently released by Silent Circle, finding traces of amateurism in the implementation of SDP. This is not where the hard cryptographic stuff takes place. But because this is a C++ implementation, any exploitable flaw at this level could have deep security implications. A buffer overflow could result in execution of code chosen by the attacker, and the attacker-chosen code could disable the cryptography entirely, or worse.

An attack on the protocol implementation is the equivalent of a $5 wrench in a well-known cryptography-related XKCD comic that probably does not require linking to at this point.

A mysterious commit message

[Metrics] Fix most complicated plugin ever, metrics (globals counted twice if declared and defined)

The above is the message attached by a facetious colleague to commit 21435 during the just elapsed week. After inquiry, the message is at least one-third sarcastic. Still, no-one ever said that the metrics plug-in should be simpler or easier to get right on first try than any of the other Frama-C plug-ins. And in fact, this was neither the first nor the last minor issue fixed in that plug-in last week.


Frama-C's metrics plug-in gathers various measurements of a syntactic nature. The “metrics” plug-in can help check how much of an unfamiliar codebase has actually been analyzed with a plug-in of a more semantic flavor, such as the value analysis. It is used, in context, in the next section of this post.

One reason “metrics” is as hard to get right as any other Frama-C plug-in is that it deals with syntax. Semantic plug-ins such as the value analysis only need to manipulate abstract objects such as functions and variables. In the metrics plug-in, details matter, such as the nuance between the declaration (in a .h header file) and the definition (in a .c file) of the same function (that was the bug being fixed in commit 21435), or the subtle difference between the source location of a variable and the source location of its initializer (that was another bug fixed this week).
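
To make the declaration/definition nuance concrete, here is an illustrative example (not the actual input that triggered the bug): a naive syntactic count that treats both occurrences of checksum below as distinct globals reports one function too many.

/* decl.h: the declaration that must not be counted as a second global */
int checksum(const char *p, unsigned int n);

/* decl.c: the one and only definition */
#include "decl.h"
int checksum(const char *p, unsigned int n)
{
    unsigned int acc = 0;
    while (n--)
        acc += (unsigned char)*p++;
    return (int)acc;
}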


Metaphorically, the metrics plug-in is to the Frama-C platform what the SDP parser is to an end-to-end encryption solution. It is not sexy, but it is just as hard to get right and as essential as the rest.

Using the metrics plug-in

Life being what it is, we sometimes find ourselves analyzing very unfamiliar code. I am talking code without documentation here, and often without either the hardware or a minimal description of the hardware that would be necessary to understand what behaviors can be exercised in the source code. This is one case when the metrics plug-in helps either anticipate how much work will be necessary, or how much has already been done.


Inspired by Robert Graham's post, I converted the Silent Circle SDP parser from C++ to C so as to see whether I could not find a bug in it using Frama-C's value analysis. At some point I had this modified parseSDP.c file and this modified sdp.h file. The originals can be seen here for parseSDP.c and sdp.h.

$ frama-c -cpp-command "gcc -C -E -I. -I `frama-c -print-share-path`/libc" parseSDP.c -metrics
[kernel] preprocessing with "gcc -C -E -I. -I /usr/local/share/frama-c/libc  parseSDP.c"
parseSDP.c:120:[kernel] warning: Calling undeclared function ipstr2long. Old style K&R code?
parseSDP.c:121:[kernel] warning: Calling undeclared function reverseIP. Old style K&R code?
parseSDP.c:328:[kernel] warning: Calling undeclared function u. Old style K&R code?
[metrics] Defined functions (6)
          =====================
           findMediaId (0 call); hasAttrib (0 call); findCrLf (4 calls);
           findip (3 calls); getMediaParams (1 call); parseSDP (0 call); 
          
          Undefined functions (10)
          ========================
           printf (1 call); strlen (1 call); strncmp (6 calls); atoi (2 calls);
           isdigit (4 calls); islower (1 call); isxdigit (1 call); ipstr2long (1 call);
           reverseIP (1 call); u (4 calls); 
          
          Potential entry points (3)
          ==========================
           findMediaId; hasAttrib; parseSDP; 
...

This snapshot is a good point at which to use the metrics plug-in, as above. The plug-in says that good entry points to start the analysis from are findMediaId(), hasAttrib() and parseSDP(). The heuristic here is that these functions are provided and aren't called, so they are likely to be part of the API.

The metrics plug-in also says that the only missing functions are those from printf() to isxdigit() (these are part of the standard library), u() (I introduced this function myself when converting from C++ to C), and ipstr2long() and reverseIP(), that are the only two that require digging up from elsewhere in the codebase. At that point I decided that the functions probably weren't important and that I could let Frama-C infer some sort of behavior for them based on their respective types.

The $5-wrench kind of flaw in the verification process would be to fail to notice that an important function is missing. Then the value analysis might report few or no issues, not because the code is safe, but because the analysis failed to visit most of it. In another scenario, the value analysis might fail to visit a large part of the code because there is an error early on that makes it look like the rest is unreachable. The “metrics” plug-in is the best safeguard we have against this kind of mistake, which is easy to make when verifying unfamiliar code.

A quick look at function parseSDP()

I finally used the command line:

$ frama-c -cpp-command "gcc -C -E -I. -I `frama-c -print-share-path`/libc" parseSDP.c \
      -lib-entry -main parseSDP -context-valid-pointers -context-width 20 -val


Here are a few alarms found this way, cherry-picked as interesting because they are caused directly by the contents of the buffer pointed to by buf, which we assume to be under an attacker's control for this exercise.

Some alarms seem by contrast to be caused by the contents of *psdp. These are not as interesting to investigate, because that structure probably has some invariants it is supposed to satisfy, so the alarms we get there are only caused by the lack of context.

From the way it is handled inside parseSDP(), it seems that buf is only assumed to point to a zero-terminated string. It is a good way to pass a Sunday afternoon to see whether these alarms correspond to a real danger of buffer overflow for some input.

parseSDP.c:84:[kernel] warning: out of bounds read. assert \valid_read(buf+1);
parseSDP.c:82:[kernel] warning: out of bounds read. assert \valid_read(buf);
...
parseSDP.c:162:[kernel] warning: out of bounds read. assert \valid_read(buf);
parseSDP.c:165:[kernel] warning: out of bounds read. assert \valid_read(buf-2);
parseSDP.c:167:[kernel] warning: out of bounds read. assert \valid_read(buf+1);
...
parseSDP.c:262:[kernel] warning: out of bounds read. assert \valid_read(buf);
...
parseSDP.c:291:[kernel] warning: out of bounds read. assert \valid_read(buf);
parseSDP.c:294:[kernel] warning: out of bounds read. assert \valid_read(buf);
...
parseSDP.c:300:[kernel] warning: out of bounds read. assert \valid_read(buf);
...
parseSDP.c:303:[kernel] warning: out of bounds read. assert \valid_read(buf);
...
parseSDP.c:308:[kernel] warning: out of bounds read. assert \valid_read(buf);
...
parseSDP.c:308:[kernel] warning: out of bounds read. assert \valid_read(buf);
parseSDP.c:309:[kernel] warning: out of bounds read. assert \valid_read(buf);
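
If that Sunday afternoon materializes, a natural next step is to replace -lib-entry with an explicit analysis harness that models the attacker-controlled, zero-terminated buf. Below is a hedged sketch: Frama_C_interval() comes from Frama-C's __fc_builtin.h, while the one-argument parseSDP() prototype is a simplified, hypothetical stand-in for the real signature.

/* Hedged sketch of an explicit analysis harness, as an alternative to
   -lib-entry. The parseSDP() prototype is a hypothetical simplification. */
#include "__fc_builtin.h"

#define BUF_SIZE 200

int parseSDP(char *buf);   /* hypothetical simplified prototype */

void harness(void)
{
    char buf[BUF_SIZE];
    int i;
    /* model an attacker-controlled, zero-terminated string: every byte
       before the terminator takes any non-zero value */
    for (i = 0; i < BUF_SIZE - 1; i++)
        buf[i] = Frama_C_interval(1, 255);
    buf[BUF_SIZE - 1] = 0;
    parseSDP(buf);
}

Analyzed with -main harness, this makes the assumption about buf explicit, instead of relying on -context-valid-pointers and -context-width.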

Friday, February 1 2013

ENSL seminar

Seminar

As anticipated, I was at my alma mater's student seminar last Tuesday. School and seminar were very much like I remembered them. The latter was improved by orange juice and biscuits to chat around after the talk, which I do not think were part of the protocol when I was a student.

My years at the ENS of Lyon were a good time, but I had forgotten the reasons that made them so. If any of the students from the seminar read this, I hope they will pass on my thanks to everyone for reminding me why. In one word, it is the company.


The sort of discussion that emerges naturally in an open, low-stakes chat after a general presentation of Frama-C, but is too open-ended to be part of the talk's questions, is that of general usefulness.

Not every programmer is writing critical embedded code—in fact, most of us aren't. But in this day and age, many of us who program at all write code with mild to severe security implications, and I believe that with a reasonable amount of work, Frama-C can next help here.

Video games, and software security in the news

The seminar also gave me an excuse to visit my family that lives in the general area of Lyon. As a result, today, I was helping my nephew play Professor Fizzwizzle (now an online game, apparently) after lunch, when the 13:00 TV news announced that some webcams, sold for the purpose of allowing homeowners to watch their living room remotely and reassure themselves that they are not being burgled, instead allowed anyone with internet access to watch, at any time.

This was not by design, of course. This was the result of a security flaw. I am not saying that the security flaw was one of the kind Frama-C identifies, say, a buffer overflow in a C program: the France 2 news report (in French, available for a week) did not go into such detail. But I am willing to bet that software was involved.

The future

So what world is my three-year-old nephew going to grow up in? Either we collectively solve the software security problem, and hopefully, when he is old enough to lend an ear to TV news while playing Professor Fizzwizzle, there won't be any more of these spectacular failures to announce. Or we do not, and then, at the current rhythm, TV news won't announce security failures because they will not be news any more.

A reflexive illustration

One last thing: this blog post contains two links, one that I expect points to a Java web applet (my nephew was playing a game that I bought a license for and downloaded the old-fashioned way in the early 2000s, but it always felt like Java, especially at launch time). The other is to a Flash video. If either of them worked for you, you should seriously think about disabling plug-ins in your browser. Yes, you will be missing out, but frankly, this is the only solution at this time.


Students from ENSL or elsewhere, if you would like to help solve this problem (at least the part where embedded-like C code is concerned) and at the same time solve your problem of picking a first- or second-year internship, e-mail addresses of Frama-C developers are either firstname dot lastname at inria dot fr for contributors from Inria, or firstname dot lastname at cea dot fr for contributors from CEA (including me).
