Archive for November, 2012

What happens if the RTT is broken? – dynamic setting of system parameters

Since our experience with RTTs is relatively new, we cannot be sure that RTTs that are considered secure today will still be secure in the future. In particular, we must assume that once RTTs are used to secure something of value they will become the target of many hacking attempts, and attackers might find a way of efficiently breaking RTTs. We should therefore be prepared for the event that RTTs that are used by the system are broken. The analysis below also applies to the situation where an adversary is not able to technically break the RTT, but can obtain a large amount of human help to answer RTTs.

Identifying a successful attack : The first challenge is identifying that attackers are able to break the RTTs that are used by the system. The following techniques will help to identify large scale break-in attempts. During normal operation we can expect a certain fraction of login attempts to pass the RTT but fail in the password entry. These are mostly due to users that mistype their passwords, or to amateurs that run manual dictionary attacks. When the RTT is broken by an automated program, or possibly by attackers that organize “sweatshops” of low paid workers that are solving RTTs, this fraction should go up noticeably (once the RTT is broken the attackers can pass this test, but should still find it hard to guess the correct password). When the system recognizes that the fraction of unsuccessful login attempts that solve the RTT increases, it should assume that the RTT is broken.

Countermeasures :  Once it is recognized that attackers can break the RTT, the first line of defense should be increasing the fraction p of login attempts for which an RTT is required. (At an extreme, the system can require more than one RTT to be solved for every login attempt, until it finds a different solution against the attack. This corresponds to a value p greater than 1, in terms of calculating the expected number of RTTs that have to be broken). This step requires the attacker to solve more RTTs for every account that it attempts to break. If the attack is conducted using human workers, or is computationally intensive, then the operation of the attacker
is slowed down until it recruits more human workers or installs more RTT breaking servers. We should therefore examine whether the fraction of login attempts that pass the RTT but fail the password entry decreases when p is increased. If this is not the case, then we should assume that the attacker has found a scalable way of breaking RTTs. The system must therefore change the RTT that is being used, in hope that the attacker does not have an efficient way of solving it. (This change does not slow down human attackers, but we believe, based on the quantitative analysis presented above, that the cost of human attacks is too high for them to be effective, e.g.
requires almost a week of human work per account.) The operators of the system must therefore have a set of different RTT tests. Once a successful automated attack is detected the system should switch to a new test, preferably taken from a new domain and sufficiently different from the existing test.

Additional security against attacks on individual accounts can be provided in the following way. When the server notices that there is a high number of failed login attempts to one (or a small number of) accounts, a flag is raised warning the security administrator that with high probability an attack is mounted on these accounts. The system could then increase, for these individual accounts, the fraction of login attempts that require RTTs, or even change the type of RTTs that should be solved, in order to slow down an attacker. The administrator may also contact the user via an out-of-band channel, such as email or phone, to agree on an alternative login method. The key advantages of the RTT scheme in this case are that (1) the security administrator is given significantly more time to react than in ”conventional” schemes where an account can be locked within seconds (e.g. after 5 unsuccessful login attempts), and that (2) a legitimate user will still be able to login to his account during this time period.

, , , , , ,

Leave a comment

The pyramidal computing resources infrastructure

Supercomputers are often referred to as forming the top of a pyramid of resources, which then in turn are referred to as mid-level systems and basic resources at the bottom. Although the various pyramids may yet differ in height or broadness of base, this picture makes much sense. The picture usually reflects at the same time different layers of responsibility, such as – going from bottom to top – to individuals and research groups, universities of institutes, national and/or European responsibilities. Again at the same time these responsibilities reflect usually also the cost in terms of real money of investment in any of these layers. The higher the direct investment level, the more cooperation and the more strategic visions are required to get the funding and the broader context in which the funding takes place. But the pyramid is not just a series of resources ranked after performance and stacked on top of each other. The pyramid is essential to a solid research computing environment and should be kept in tact at all times.

For the foreseeable future there seems no limit in wanted computer resources/cycles for science and research. This insatiable need can only be satisfied if at all levels of the pyramid the required resources are available and sustainable: desktop facilities, local clusters (or a gridified cluster environment) for direct and on demand access, higher level systems for (very) demanding codes at the national level (or an equivalent gridified environment) and national/European level supercomputers for “qualified” science/research. The pyramidal structure is essential for this scheme, because building on the top of the pyramid only makes sense if the base is well founded. If the middle layer, that is due to be formed by national HPC systems, is absent, the work permeates one way or another to either other layers. This leads to a cost ineffective use of those resources if not directly to a shortfall in scientific production and quality. The disappearance of the middle layer may readily result from a funding scheme for the top where effectively the funding for the middle layer is used to pay for the top. Such a scheme would not yield a continuum of resources and certainly not more cycles for scientists but merely form another bureaucratic complication for the access to the still urgently needed cycles by scientists and researchers.

The grid infrastructure is the vehicle that glues all resources in the pyramid to one transparent environment and that has to guarantee that all resources in the pyramid will be optimally used for their very purposes. This should not be under utilized nor be infinitely queued up. The layers in the pyramid are admittedly not very sharp. A system based on a given technology may be in the basic layer if the number of processors is small, in the middle layer if the number is significant and at the top if their number is abundant. But in addition to these layers based on performance of “size” it is also the access possibilities that co-determine the layers: at the basic level the access is instantaneous, at the middle level there may be queues or appointments with colleagues or other user groups about sharing policies. At the top an additional hurdle must be taken, which usually is peer review. But without local and institutional /national and easily accessible resources it will be difficult to make one’s case for access to the peer reviewed environment.

The form of the European-wide pyramid may, in addition to the elements already mentioned, become more dependent on the total cost of ownership/cost per cycle than has been the case. This is due to a few parameters that were around already but have become differently valued more recently. Such parameters include energy cost, heat production, support cost.

It is important to provide the suitable level of resources in the pyramid for the research groups in order to guarantee cost efficient usage and still reserve the high-end resources only to those applications that can really benefit from such a system.

Figure 1: Schematic view on the integrated performance with an 18-months investment cycle, a doubling of the performance of each new system and an 6-year depreciation period (y-axis is logarithmic)

, , , , , , , , ,

Leave a comment

Combining Computation and Communication Costs

We have so far been concerned with the communication costs incurred by repartitioning and have considered the cost of computing (i.e. disk and cpu costs other than communication) an operator to be independent of the partitioning attribute. We now extend our model to account for the interaction of these costs and show how the basic ideas of the ColorSplit algorithm carry over to the extended model to yield an efficient optimization algorithm.  Several alternate strategies, each with a different cost, may be available for an operator. The following example shows the interaction between between computation and communication costs using the standard scenario of having several strategies for computing operators such as joins and grouping.

Example 1 Given the schema Emp ( emp# , salary, dep# , city) and Dep(dep#, city), the following query finds the average salaries of employees grouped by city for those employees who live in the same city as the the location of their department.

Select, avg( e.salary)
From Emp e, Dep d
Where e.dep# = d.dep# and =
Group by;

Suppose Emp is partitioned by city and each partition is stored in sorted order by city. Suppose Dep is partitioned by dep# and each partition has an index on dep#. Figure 10 shows two query trees. The computation of Avg is assumed to be combined with GroupBy. The first query tree uses the join predicate on dep# and repartitions the Emp table. Due to the availability of an index on Dep, a
nested-loops strategy may be the cheapest for joining each partition of Emp (outer) with its corresponding partition of Dep (inner). The grouping operator is implemented by a hash-grouping strategy.

The second query tree uses the join predicate on city and repartitions the Dep table. Since each partition of Emp is presorted, it may be cheapest o use a sort-mergejo in for joining corresponding partitions. Since the output of merge join is pre-sorted in addition to being pre-partitioned on the city, the grouping operator uses a sort-grouping strategy.  The example illustrates several points. Firstly, while partitioning impacts communication costs, other physical properties (sort-order and indexes) impact computation costs. We will generalize the notion of a color to capture all physical properties.

Secondly, a strategy expects its inputs to have certain physical properties and guarantees its output to have some other properties. We will specify such input-output constraints using color patterns.

Thirdly, the overall cost is reduced when an input to a strategy happens to have the expected physical property. We will therefore break the cost of computing an operator into the intrinsic cost of the strategy itself and the cost of getting the inputs into the right form. The latter will be modeled as a recoloring cost that may or may not be incurred.


, , , , ,

Leave a comment

General Security and Privacy Mechanisms for Liberty Identity Web Services Framework

Security in the Liberty Framework is layered. Liberty protocols are constructed with extensive security mechanisms. Furthermore they build upon various Internet protocols that themselves have embedded security mechanisms.  Table 1 generally summarizes the security mechanisms incorporated in the Liberty specifications, and thus in Liberty enabled implementations, across two axis: channel security and message security. It also generally summarizes the security-oriented processing requirements placed on Liberty implementations.

Table 1. Liberty security mechanisms

Channel security addresses how communication between identity providers, service providers, and user agents is protected. Liberty implementations must use TLS1.0 or SSL3.0 for channel security, although other communication security protocols may also be employed (for example, IPSec) if their security characteristics are equivalent to TLS or SSL. Note: TLS, SSL, and equivalent protocols provide confidentiality and integrity protection to communications between parties as well as authentication.

Critical points of channel security include the following:

  1. In terms of authentication, service providers are required to authenticate identity providers using identity provider server-side certificates. Identity providers have the option to require authentication of service providers using service provider client-side certificates.
  2. The authenticated identity of an identity provider must be presented to a user before the user presents personal authentication data to that identity provider.

Message security addresses security mechanisms applied to the discrete Liberty protocol messages passed between identity providers, service providers, and user agents. These messages are exchanged across the communication channels whose security characteristics were just discussed.

Critical points of message security include the following:

  1. Liberty protocol messages and some of their components are generally required to be digitally signed and verified. Signing and verifying messages provide data integrity, data origin authentication, and a basis for nonrepudiation. Therefore, identity providers and service providers are required to use key pairs that are distinct from the key pairs applied for TLS and SSL channel protection and that are suitable for long-term signatures.
  2. In transactions between service providers and identity providers, requests are required to be protected against replay, and received responses are required to be checked for correct correspondence with issued requests. Time based assurance of freshness may be employed. These techniques provide transaction integrity.

To federate, providers are required to join communities of trust such as PKI or Kerberos frameworks, or to establish bilateral agreements. These should include obtaining X.509 credentials, establishing and managing trusted public keys, and managing life cycles of corresponding credentials.

Many of the security mechanisms mentioned above, for example, SSL and TLS, have dependencies upon, or interact with, other network services and/or facilities such as the DNS, time services, firewalls, etc. These latter services and/or facilities have their own security considerations upon which Liberty-enabled systems are thus dependent

, , , , , , , , , , , , , ,

Leave a comment

%d bloggers like this: