Archive for December, 2011

Changing Java’s Semantics for Handling Null Pointer Exceptions

We envision a world where no exceptions are raised; instead, language semantics are changed
so that operations are total functions. Either an operation executes normally or tailored
recovery code is applied where exceptions would have been raised. As an initial step and
evaluation of this idea, we propose to transform programs so that null pointer dereferences
are handled automatically without a large runtime overhead. We increase robustness by replacing
code that raises null pointer exceptions with error handling code, allowing the program to
continue execution. Our technique first finds potential null pointer dereferences and then
automatically transforms programs to insert null checks and error-handling code. These
transformations are guided by composable, context-sensitive recovery policies. Error-handling
code may, for example, create default objects of the appropriate types or restore data
structure invariants. If no null pointers would be dereferenced, the transformed program behaves
just as the original. We applied our transformation in experiments involving multiple benchmarks,
the Java Standard Library, and externally reported null pointer exceptions. Our technique is
able to handle the reported exceptions and allow the programs to continue to do useful work, with
an average execution-time overhead of less than 1% and an average bytecode space overhead of 22%.

Null pointer exception management is a logical starting point for changing Java’s semantics for
exception handling, because of the simplicity and regularity of null pointer exceptions. Null pointer
exceptions, while conceptually simple, remain prevalent in practice. Null pointer dereferences are not
only frequent, but also catastrophic and are “a very serious threat to the safety of programs”. Many
classes of null pointer exceptions can be found automatically by static analyses, and they have been
reported as one of the top ten causes of common web application security risks. Addressing such risks
with fault-tolerance techniques is a promising avenue. For example, techniques that mask memory errors
have successfully eliminated security vulnerabilities in servers.

Though Java already provides an infrastructure for exceptions, the current state of the language is
only a partial solution. Java makes a clear distinction between checked and unchecked exceptions.
The former are included in method type signatures and must be addressed by the programmer; the latter
may be ignored without compiler complaint. Unchecked exceptions should also be documented and properly
handled by the language, in a systematic and universal manner. Java treats null pointer exceptions as
unchecked by default, while APPEND’s approach to null pointer prevention is similar to the way Java
treats checked exceptions: an undesirable situation or behavior is identified by the programmer, and
some error handling code is generated. One reason null pointer exceptions are not treated as checked by
Java is that there are many potential sources of null pointer dereferences and different recovery
situations would have to be embedded in multiple local catch blocks explicitly: a time-consuming and
error-prone task. First, it would be difficult to identify, for each null pointer exception that
propagated upward, what kind of recovery code could be applied without contextual information.
Secondly, Java’s current exception handling mechanisms also open up the possibility of a breakdown of
encapsulation and information hiding, as implementation details from lower levels of scope are raised
to the top level of the program. A solution to null pointer exceptions that is able to prevent or mask
them in a way that is both reasonable and accessible to the programmer has yet to be implemented.
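
To make the checked/unchecked distinction above concrete, here is a minimal illustrative fragment (our own example, not taken from APPEND): the compiler forces callers to deal with the checked IOException, but happily accepts a call that may throw the unchecked NullPointerException.

import java.io.FileReader;
import java.io.IOException;

public class CheckedVsUnchecked {

    // Checked: IOException must be declared or caught, so the compiler
    // forces every caller to think about the failure case.
    static char readFirstChar(String path) throws IOException {
        try (FileReader reader = new FileReader(path)) {
            return (char) reader.read();
        }
    }

    // Unchecked: nothing in the signature warns that s may be null;
    // a NullPointerException can escape without any declaration.
    static int length(String s) {
        return s.length();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readFirstChar("example.txt")); // hypothetical file name
        System.out.println(length(null));                 // compiles, fails at run time
    }
}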

APPEND is a program transformation that changes Java’s null pointer exception handling by automatically
inserting null checks and error-handling code. No program annotations are required, and developers need
not wade through defect reports. Programs are modified according to composable recovery policies.
Recovery policies are executed at compile-time and, depending on the context, recovery code is inserted
that is then executed at run-time if the null checks fail. Recovery policies are conceptually related to
theorem prover tactics and tacticals or to certain classes of aspect-oriented programming. If no null
values are dereferenced at run-time, the transformed program behaves just as the original program. If the
original program would dereference a null value, the transformed program instead executes the
policy-dictated error-handling code, such as creating a default value on the fly or not calculating that
expression. Previous research has suggested that programs might successfully continue even with discarded
instructions; we present and measure a concrete, low-level, annotation-free version of such a system, and
extend it to allow for user-specified actions.
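
As a rough illustration of the shape of code the transformation produces, consider the sketch below. It is hand-written, not actual APPEND output, and the "default value" recovery policy shown is only one of the policies mentioned above.

import java.util.HashMap;
import java.util.Map;

public class AppendSketch {

    static final Map<Integer, String> NAMES = new HashMap<>();

    // Original statement: return NAMES.get(id).trim();
    // which raises a NullPointerException when the id is unknown.
    static String lookupTrimmed(int id) {
        String name = NAMES.get(id);
        if (name == null) {
            // Inserted null check and policy-dictated recovery code:
            // create a default value of the appropriate type and continue.
            name = "";
        }
        return name.trim();
    }

    public static void main(String[] args) {
        // Prints "[]" instead of crashing, as the transformed program would.
        System.out.println("[" + lookupTrimmed(7) + "]");
    }
}

When the map does contain an entry for the id, the guarded version behaves exactly like the original, mirroring the guarantee stated above.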

The idea behind this approach is that null pointer dereferences are undesirable, especially when they
occur in non-critical computations yet still force the program to crash. We hypothesize that in many
applications the preferred behavior would be to keep the program from ceasing execution, logging the
event and performing some sort of recovery code instead of raising an exception. It therefore becomes
possible to check every pointer dereference in the code for nullness and to include recovery code for
every such instance; we argue that this is a practical and preferable way to deal with null pointer
exceptions in Java. Because we intend to check every pointer dereference for nullness, we could have
chosen to take advantage of the existing null checking of the Java virtual machine. Given the low
overhead of our tool, we chose instead to work at the application level, to remain portable and not
have to rely on a single modified JVM instantiation. The transformation can be implemented directly
atop existing program transformation frameworks and dovetails easily with standard development processes.

Write Optimized Tree Indexing

Due to the poor random write performance of flash SSDs (Solid State Drives), write-optimized tree indexes have been proposed to improve update performance. BFTL was proposed to balance the inferior random write performance and the fast random read performance of flash-memory-based sensor nodes and embedded systems. It allows the index entries of one logical B-tree node to span multiple physical pages, and maintains an in-memory table that maps each B-tree node to its physical pages. Newly inserted entries are packed and then written together to new blocks, and the table entries of the corresponding B-tree nodes are updated, thus reducing the number of random writes. However, BFTL entails a high search cost, since it accesses multiple disk pages to search a single tree node. Furthermore, even though the in-memory mapping table is compact, the memory consumption is still high.

FlashDB was proposed to implement a self-tuning scheme that switches between a standard B+-tree and BFTL, depending on the workload and the type of flash device. Since our proposed index mostly outperforms both the B+-tree and BFTL under various workloads on different flash SSDs, we do not compare our index with this self-tuning index in this paper.

More recently, the LA-tree was proposed for flash memory devices by adding adaptive buffers between tree nodes. The LA-tree focuses on raw, small-capacity, byte-addressable flash memory devices, such as sensor nodes, whereas our work targets off-the-shelf, large flash SSDs, which provide only a block-based access interface. The different target devices of the two indexes account for their differences in design.
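
To make the BFTL bookkeeping above concrete, here is a minimal sketch; the class and method names are ours, and physical pages are reduced to integer ids. The point is that a read of one logical node must visit every page it spans, which is exactly the search overhead noted above.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a BFTL-style node translation table: each logical B-tree node id
// maps to the list of physical pages that hold pieces of that node.
public class NodeTranslationTable {

    private final Map<Integer, List<Integer>> nodeToPages = new HashMap<>();

    // A batch of newly inserted entries is packed into one freshly written page;
    // only the in-memory table is updated, so no in-place random write occurs.
    public void recordBatchWrite(int newPageId, List<Integer> touchedNodeIds) {
        for (int nodeId : touchedNodeIds) {
            nodeToPages.computeIfAbsent(nodeId, k -> new ArrayList<>()).add(newPageId);
        }
    }

    // Searching one logical node requires reading every page it spans,
    // which is why BFTL pays a higher search cost than a plain B-tree.
    public List<Integer> pagesToRead(int nodeId) {
        return nodeToPages.getOrDefault(nodeId, List.of());
    }
}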

On hard disks, many disk-based indexes optimized for write operations have also been proposed.
Graefe proposed a write-optimized B-tree by applying the idea of the log-structured file system to the
B-tree index. The Y-tree supports high-volume insertions for data warehouses, following the idea of the
buffer tree. Logarithmic structures have been widely applied to optimize write performance. O’Neil et al.
proposed the LSM-tree and its variant LHAM for multi-version databases. Jagadish et al. used a similar
idea to design a stepped tree index and a hash index for data warehouses. Our FD-tree follows the
idea of the logarithmic method. The major difference is that we propose a novel method based on the
fractional cascading technique to improve the search performance on the logarithmic structure.
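
A toy sketch of the logarithmic method that FD-tree builds on is given below: several sorted runs of geometrically growing capacity, with inserts absorbed by the smallest run and overflowing runs merged into the next larger one. The fractional cascading refinement that FD-tree adds to speed up the multi-level search is deliberately left out, and all names here are ours.

import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy logarithmic structure: level i holds at most CAPACITY * 2^i keys.
// Inserts go to level 0; an overflowing level is merged into the next one
// (a sequential write pattern), so random in-place updates are avoided.
public class LogarithmicIndex {

    private static final int CAPACITY = 4;
    private final List<TreeSet<Long>> levels = new ArrayList<>();

    public void insert(long key) {
        if (levels.isEmpty()) levels.add(new TreeSet<>());
        levels.get(0).add(key);
        // Cascade merges while a level exceeds its capacity.
        for (int i = 0; i < levels.size(); i++) {
            if (levels.get(i).size() > (CAPACITY << i)) {
                if (i + 1 == levels.size()) levels.add(new TreeSet<>());
                levels.get(i + 1).addAll(levels.get(i)); // merge into the larger level
                levels.get(i).clear();
            }
        }
    }

    // A lookup probes every level; FD-tree's fractional cascading exists to
    // make this multi-level search cheaper than independent searches.
    public boolean contains(long key) {
        for (TreeSet<Long> level : levels) {
            if (level.contains(key)) return true;
        }
        return false;
    }
}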

Helix

Helix is a high-speed stream cipher with a built-in MAC functionality. On a Pentium II CPU it
is about twice as fast as Rijndael or Twofish, and comparable in speed to RC4. The overhead per
encrypted/authenticated message is low, making it suitable for small messages. It is efficient
in both hardware and software, and with some pre-computation can effectively switch keys on a
per-message basis without additional overhead.

Basic security services require both encryption and authentication. This is (almost) always done
using a symmetric cipher—public-key systems are only used to set up symmetric keys—and a Message
Authentication Code (MAC). The AES process provided a number of very good block cipher designs,
as well as a new block cipher standard. The cryptographic community learned a lot during the
selection process about the engineering criteria for a good cipher. AES candidates were compared
in performance and cost in many different implementation settings. We learned more about the
importance of fast rekeying and tiny-memory implementations, the cost of S-boxes and circuit depth
for hardware implementations, the slowness of multiplication on some platforms, and other
performance considerations.

The community also learned about the difference of cryptanalysis in theory versus cryptanalysis
in practice. Many block cipher modes restrict the types of attack that can be performed on the
underlying block cipher. Yet the generally accepted attack model for block ciphers is very
liberal. Any method that distinguishes the block cipher from a random permutation is considered
an attack. Each block cipher operation must protect against all types of attack. The resulting
over-engineering leads to inefficiencies.

Computer network properties like synchronization and error correction have eliminated the
traditional synchronization problems of stream-cipher modes like OFB. Furthermore, stream ciphers
have different implementation properties that restrict the cryptanalyst. They only receive their
inputs once (a key and a nonce) and then produce a long stream of pseudo-random data. A stream
cipher can start with a strong cryptographic operation to thoroughly mix the key and nonce into
a state, and then use that state and a simpler mixing operation to produce the key stream. If the
attacker tries to manipulate the inputs to the cipher he encounters the strong cryptographic
operation. Alternatively he can analyse the key stream, but this is a static analysis only. As
far as we know, static attacks are much less powerful than dynamic attacks. As there are fewer
cryptographic requirements to fulfill, we believe that the key stream generation function can be
made significantly faster, per message byte, than a block cipher can be. Given the suitability of
stream ciphers for many practical tasks and the potential for faster implementations, we believe
that stream ciphers are a fruitful area of research.

Additionally, a stream cipher is often implemented—and from a cryptographic point of view, should
always be implemented—together with a MAC. Encryption and authentication go hand in hand, and
significant vulnerabilities can result if encryption is implemented without authentication.
Outside the cryptographic literature, not using a proper MAC is one of the commonly encountered
errors in stream cipher systems. A stream cipher with built-in MAC is much more likely to be used
correctly, because it provides a MAC without the associated performance penalties.

Helix is a combined stream cipher and MAC function, and directly provides the authenticated
encryption functionality. By incorporating the plaintext into the stream cipher state Helix can
provide the authentication functionality without extra cost. Helix’s design strength is 128 bits,
which means that we expect that no attack on the cipher exists that requires fewer than 2^128
Helix block function evaluations to be carried out. Helix can process data in less than 7 clock
cycles per byte on a Pentium II CPU, more than twice as fast as AES. Helix uses a 256-bit key and
a 128-bit nonce. The key is secret, and the nonce is typically public knowledge. Helix is
optimised for 32-bit platforms; all operations are on 32-bit words.
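
The shape of such a combined cipher-and-MAC interface can be sketched as below. Only the key and nonce sizes come from the description above; the method names and the 16-byte tag are illustrative assumptions, and no attempt is made to reproduce Helix’s actual block function.

// Hypothetical interface shape for a Helix-like authenticated stream cipher.
// Only the parameter sizes come from the text; everything else is illustrative.
public interface AuthenticatedStreamCipher {

    /** Result of authenticated encryption: ciphertext plus a MAC tag. */
    final class Sealed {
        public final byte[] ciphertext;
        public final byte[] tag;          // e.g. a 16-byte (128-bit) tag, assumed here
        public Sealed(byte[] ciphertext, byte[] tag) {
            this.ciphertext = ciphertext;
            this.tag = tag;
        }
    }

    /** key must be 32 bytes (256 bits); nonce is 16 bytes (128 bits) and unique per message. */
    Sealed encryptAndAuthenticate(byte[] key, byte[] nonce, byte[] plaintext);

    /** Returns the plaintext only if the tag verifies; otherwise the message is rejected. */
    byte[] decryptAndVerify(byte[] key, byte[] nonce, byte[] ciphertext, byte[] tag);
}

Exposing encryption and authentication as a single operation is what makes it harder to encrypt without authenticating, the misuse discussed above.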

Improved B-trees To Reduce Maintenance Cost And Fragmentation

Data warehouses support processing very large amounts of data. The workload is primarily composed
of SELECT queries; updates are less frequent and typically come in batches. And yet the INSERT operation
can be expensive and have a significant impact on SELECT query performance. Inserted data has to be
placed in its respective location based on the existing indexing structures; except for an index on
the arrival time stamp, updates are likely to be scattered over the entire data range. If the modifications
do not exhibit access locality, each one will touch one or more different disk pages, causing a high
performance penalty (irrespective of any batching). The relevant page has to be read, modified and
eventually written back to disk once the memory buffer fills up. In this paper we propose a Fractured
Index mechanism to alleviate the penalties associated with the update costs while still being able to
use B-Tree indexes. By creating a mini-Index for each batch of data, we ensure insert locality (all data
within the mini-Index is freshly inserted) and keep the indexes read-only (avoiding fragmentation
of these indexes and providing compression opportunities). The idea of batching is frequently used in
databases; storing insert batches in the B-Tree structures is a logical extension of the same approach.
Although earlier work has presented a similar delayed update approach, we believe that our idea is much
simpler to implement. This minimalistic implementation achieves a two-orders-of-magnitude improvement
for INSERTs and an order-of-magnitude improvement for SELECT queries.

B-tree provides a fast read access to data by sorting the data by indexed attribute (key) values and
hierarchically organizing the data into intermediate and leaf nodes. B-tree splits and merges nodes upon
insertions and deletions to keep the tree balanced. Most database products provide two types of B-tree
indexes, primary (clustered) and secondary (non-clustered). A primary B-tree index stores tuple data in
its leaf nodes while a secondary B-tree index stores only pointers to the primary index (or heap if no
primary index exists). A primary B-tree index provides a faster read access than a secondary B-tree index,
especially for Online Analytical Processing (OLAP) style queries, because an access path using a secondary
B-tree index needs a random disk seek for each tuple to be fetched, while OLAP queries are often not selective.
A primary B-tree index requires only one seek to locate the needed tuples and then performs a sequential scan.
This is substantially more efficient because one random disk seek takes several milliseconds while a
sequential disk access takes only microseconds to read a tuple.
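
A back-of-the-envelope calculation makes the gap concrete; the 5 ms seek time and 10 µs per sequentially read tuple below are assumed figures, chosen only to be in the ranges the paragraph mentions.

// Rough cost model for fetching N tuples, with assumed device characteristics.
public class ScanCostSketch {

    static final double SEEK_MS = 5.0;        // one random disk seek (assumed)
    static final double SEQ_TUPLE_MS = 0.01;  // sequential read per tuple, 10 microseconds (assumed)

    public static void main(String[] args) {
        long tuples = 1_000_000L;
        double secondary = tuples * SEEK_MS;              // one seek per fetched tuple
        double primary = SEEK_MS + tuples * SEQ_TUPLE_MS; // one seek, then a sequential scan
        System.out.printf("secondary index: ~%.0f s, primary index: ~%.1f s%n",
                secondary / 1000, primary / 1000);
    }
}

With these assumed numbers, fetching a million tuples costs well over an hour through the secondary index but only about ten seconds through the primary index.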

Although reads on B-tree in its initial state are efficient, writes on B-tree create two problems. First, each
write operation requires random disk seeks to access and overwrite data in B-tree. Second, splitting and
merging nodes can cause severe fragmentation of B-tree structure on disk. This can significantly degrade the
read performance on B-tree because each fragment requires a jump (seek) even for sequential read on a primary
index. These problems have become more and more significant, as random disk seek performance has barely
improved over the past few decades while disk capacity and sequential disk access performance keep
increasing rapidly. We propose a new data structure, the Fractured Index, which overcomes the problems of
the B-tree by using only sequential accesses on disk. The Fractured Index is a few orders of magnitude
faster than a B-tree to maintain and also an order of magnitude faster to query after heavy write
operations, because it causes no fragmentation on disk. It is especially beneficial for OLAP-type queries,
which are less selective. We mainly focus on the OLAP context, but the same technique could be applied in
an Online Transaction Processing (OLTP) setting to achieve better maintenance performance at the cost of a
somewhat higher penalty on read queries.

Architecture

The key idea of the Fractured Index is to sequentially output small B-trees to reduce the overhead of
inserts into tables that are organized as primary and secondary B-trees. Inserting new tuples at random
places of the B-tree is a costly operation that causes random disk seeks and fragmentation due to splits
and merges of B-tree nodes. The DBMS uses a buffer pool to alleviate this problem by keeping dirty
(modified) pages in memory and later writing them back to disk, but it is impossible for the buffer pool
to hold all of the necessary leaf pages when millions of new tuples are inserted into the main table. The
leaf pages are continuously swapped between the disk and the buffer pool as inserts continue to come in,
unless new tuples arrive in index-key order, which does not happen in the general case. This results in a
large number of disk seeks and fragmented B-trees.

The Fractured Index mitigates these problems by sequentially recording data changes (insertions and deletions)
into small B-trees separated from the main table. In order to sequentially write small B-trees (mini-index),
the Fractured Index buffers insertions and deletions into the database in an Insert Buffer and dumps the data
changes to a new mini-index when the current insert buffer becomes full. When reading the database, the query
processor reads from each mini-index in addition to the main table.
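
A minimal in-memory sketch of this flow is given below; the names are ours, and a real implementation would write each mini-index sequentially to disk as a small read-only B-tree rather than keep it as an in-memory map.

import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Sketch of the Fractured Index flow: buffer changes, dump a read-only
// mini-index when the buffer fills, and consult the main table plus all
// mini-indexes on reads.
public class FracturedIndexSketch {

    private static final int BUFFER_CAPACITY = 1024;

    private final TreeMap<Long, String> mainTable = new TreeMap<>();    // primary B-tree stand-in
    private final TreeMap<Long, String> insertBuffer = new TreeMap<>(); // in-memory Insert Buffer
    private final List<TreeMap<Long, String>> miniIndexes = new ArrayList<>();

    public void insert(long key, String tuple) {
        insertBuffer.put(key, tuple);
        if (insertBuffer.size() >= BUFFER_CAPACITY) {
            // Dump the buffer as a new, read-only mini-index; on disk this
            // would be one sequential write of a small B-tree.
            miniIndexes.add(new TreeMap<>(insertBuffer));
            insertBuffer.clear();
        }
    }

    public String lookup(long key) {
        String hit = insertBuffer.get(key);
        if (hit != null) return hit;
        // Newest mini-index first, so later versions win.
        for (int i = miniIndexes.size() - 1; i >= 0; i--) {
            hit = miniIndexes.get(i).get(key);
            if (hit != null) return hit;
        }
        return mainTable.get(key);
    }
}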

Detecting Null Pointer Violations in Java Programs

The use of formal methods has been growing steadily and there have been a number of successful
applications of formal methods in a range of application areas. It seems agreed that quality should
be assured by applying testing, analysis and formal methods to rigorously defined pre-code artifacts.
The detection of null pointer violation errors is definitely such a goal. This way of applying formal
methods has a great potential to increase our confidence in the software. The goal is to provide a
practical mechanism to assist the application of formal methods in the early detection of null pointer
violation errors in programs. The solution is theorem-proving based and focuses on identifying the
places where a theorem prover could assist in detecting null pointer violation errors and on formulating
the necessary proof obligations.

The theorem prover partially assists in checking the conditions used to control the logical flow of a
program by evaluating the proof obligations formed from the path under examination. A research prototype
is under development and the preliminary results are encouraging and demonstrate the feasibility and
effectiveness of our approach.

public class MyClass1 {

    protected String s1, s2;

    // constructor 1
    public MyClass1(…) {
        s1 = new String(…);
    }

    // constructor 2
    public MyClass1(…) {
        s1 = new String(…);
        s2 = new String(…);
    }

    public void method1() {
        … s2.length() …
    }

    public void method2() {
        String s3;
        if (cond1) {
            s3 = new String(…);
        }
        //…
        if (cond2) {
            … s3.length() …
        }
    }
}

A null pointer example-1

Null pointer detection is fundamental to the problem of program analysis. Some of the problems related
to detecting null pointer errors are aliasing, name collisions, unexecutable paths, and uninitialized
pointers. Any tool involving automatic program analysis is likely to benefit from information about null
pointers in a program. Null pointer information is essential for knowing the state of a pointer at an
arbitrary point in a program and for conducting other analyses, such as detecting array-index-out-of-bounds
and class-cast exceptions. Consider the example in example-1.

In MyClass1 two instance variables, s1 and s2, are declared. However the programmer forgets to initialize s2
in constructor 1, which is a very common mistake – forgetfulness. The consequence is that the dereferencing
of s2 in method1 may cause a null pointer exception at runtime. In method2, the local variable s3 is
initialized inside an if statement, and is dereferenced inside a different if statement. The dereference
may also cause a null pointer exception at runtime if cond2 does not imply cond1. These types of errors
are very common in programs.
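
For the dereference of s3 in method2, the proof obligation the theorem prover has to discharge can be stated roughly as follows, assuming the elided code between the two if statements assigns neither s3 nor any variable occurring in cond1:

\[
\mathit{cond2} \Rightarrow \mathit{cond1}
\qquad \text{(so that } s3 \neq \mathrm{null} \text{ holds at the call } s3.\mathrm{length()} \text{)}
\]

If the obligation cannot be proved, the path on which cond2 holds but cond1 does not is reported as a potential null pointer violation.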

Checking and Signing XML Documents on Java Smart Cards

Smart card assistance for generating digital signatures is current state of the art and best
practice. This is mainly due to the fact that smart cards nowadays have enough processing power
to produce digital signatures for documents using on-card resources (processor and memory) only.
This way the owner’s private signing key never has to leave the smart card: the signing key is
and remains permanently stored in a tamper-proof environment. A closer look at the signing
process, however, reveals a major remaining security problem, known as the “what
you see is what you sign” problem. Before signing a document, the signer usually wants to check
the document’s syntactic and semantic correctness.

When compared to the traditional process of signing a paper document with a handwritten signature,
the difference can easily be identified: In the traditional case, it is relatively easy for the user
to assert the correctness, because syntactic and semantic document checking and signature generation
are in immediate context. Digitally signing an electronic document is completely different, because
checking and signature generation are executed in two different environments, which expose fundamentally
different characteristics with respect to security on the one hand and processor, memory, and display
resources on the other.

Traditionally, the signing application computes the document’s digest using a one way hash function
and sends the result to the smart card. The card encrypts the digest by an asymmetric cipher using the
signing key stored on the card. The resulting value is the digital signature of the document. It is sent
back to the signing application. The user can neither check the syntactic correctness of the document
(in the case of XML documents: well-formedness and validity) nor the semantic correctness of the document. What
really is signed is beyond the user’s control. It might for instance be the digest for a manipulated
document. Even if the smart card can be regarded as tamper proof, the terminal (e.g. a PC) and the
programs running on it are vulnerable to viruses and Trojan horses. Such evildoers might also affect
signing applications and make them produce valid signatures for documents that are invalid from the
user’s perspective. Such incidents invalidate the signing process as a whole.

We propose an enhanced architecture which performs checking and signing of XML documents on Java smart
cards, called the JXCS architecture. The basic idea of JXCS is to shift the syntactic validation and hash value
generation from the vulnerable PC to the trusted smart card. Syntactic validation imposes the following
challenges and opportunities: the challenge is the need to process XML documents on resource-constrained
Java smart cards; the opportunity is the possibility to perform syntactic and even semantic checks on the
XML document in a tamper-proof environment, which improves the security of the signing process.
We propose three major checks on the XML documents to be signed: well-formedness, validity, and
content acknowledgement using a class 3 card reader. Taken together, all three checks can defeat “what you
see is what you sign” attacks.
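
For illustration, the two on-card steps (a well-formedness check, then digest and signature generation in one trusted environment) look like this when written against the standard desktop Java APIs. A real JXCS implementation would use the much more constrained Java Card libraries, the document and key handling here are simplified, and validity checking plus class 3 reader acknowledgement are not shown.

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import javax.xml.parsers.DocumentBuilderFactory;

// Functional sketch of the JXCS steps: check well-formedness, then hash and
// sign the document. On the card, the private key would never leave the chip.
public class CheckAndSignSketch {

    public static void main(String[] args) throws Exception {
        String xml = "<order><item>book</item></order>"; // hypothetical document

        // 1. Syntactic check: parsing fails if the document is not well-formed.
        DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // 2. Hash and sign. SHA256withRSA digests internally, so digest and
        //    signature generation both happen in the same (trusted) environment.
        KeyPair keys = KeyPairGenerator.getInstance("RSA").generateKeyPair();
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(keys.getPrivate());
        signer.update(xml.getBytes(StandardCharsets.UTF_8));
        byte[] signature = signer.sign();

        System.out.println("well-formed and signed, " + signature.length + " signature bytes");
    }
}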
