Friday, September 2, 2011

Technology watch, week 35

For this week's bulletin, I suggest the following topics:
  • A reflection on a developer's skills.
  • A summary of bug metrics in software.
  • Choosing an OS for embedded systems: Windows or Linux?
  • IBM holds the world record for data-center storage with 120 PB (120 million gigabytes).
  • IBM announces the first 20-petaflops supercomputer (20 million gigaflops), with about 100,000 processors and memory transactions handled in hardware: the PowerPC processor, a 64-bit, 18-core chip with about 1.5 billion transistors, clocked at 1.6 GHz and delivering 205 GFLOPS at 55 watts.
  • The different ways of modularizing in Java.
  • JSR 107 (Java Specification Request), a standard caching API, is (finally) planned for JEE 7 (Java Enterprise Edition 7).
  • Version 2 of the Disruptor Java framework: a concurrent-programming architecture able to consume 40 to 50 million events per second (excluding network I/O). This shows that performance does not depend on the technology (the Java language in this example) but rather on the architecture (the runtime design in this example).
  • A proposed language evolution for JDK 8: Virtual Extension Methods.
  • An explanation of Automatic Resource Management (ARM) in JDK 7: a concept that C# has offered for a long time, now adopted by JDK 7, and one that garbage-collected languages need.
  • A summary of IO and NIO in Java.
Happy reading.


What kind of developer are you?
Are you an expert or a senior developer? How would you define the term "senior consultant"? Defining and qualifying a profile is a substantial piece of work. The subject is so broad that I told myself I had to talk it over with them. In short, here I am, back to blog about it…


How many bugs do you have in your code?
If you follow Zero Bug Tolerance, of course you're not supposed to have any bugs to fix after the code is done. But let's get real. Is there any way to know how many bugs you're missing and will have to fix later, and how many bugs you might already have in your code? Are there any industry measures of code quality that you can use as a starting point?

On average, 85% of bugs introduced in design and development are caught before the code is released (this is the average in the US as of 2009). The research behind this figure shows that the defect removal rate has stayed roughly the same over 20 years, which is disappointing given the advances in tools and methods over that time.

On average we introduce around 5 bugs per Function Point, depending on the type of system being built (1 Function Point corresponds to roughly 50-55 lines of Java code). Web systems are, surprisingly, a bit lower at 4 bugs per Function Point; other internal business systems are around 5, and military systems average around 7.

Which means that for a small application of 1,000 Function Points (50,000 or so lines of Java code), you could expect around 750 defects still present at release: 5 bugs per Function Point gives 5,000 bugs introduced, and if 85% of them are caught before release, the remaining 15% works out to roughly 750.
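As a quick sanity check, here is a minimal sketch of that rule-of-thumb arithmetic in Java; the constants are simply the averages quoted above, not universal values.

public class DefectEstimate {
    public static void main(String[] args) {
        double functionPoints = 1_000;        // roughly 50,000 lines of Java code
        double bugsPerFunctionPoint = 5;      // average for internal business systems
        double defectRemovalRate = 0.85;      // ~85% of bugs caught before release

        double introduced = functionPoints * bugsPerFunctionPoint;   // 5,000
        double escaping = introduced * (1 - defectRemovalRate);      // ~750

        System.out.printf("Introduced: %.0f, still present at release: %.0f%n",
                introduced, escaping);
    }
}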


Embedded operating systems: Linux versus Windows
Should you choose Linux or Windows for your next embedded project?
Before you begin an embedded project, you have a number of choices to make. Some of these decisions about hardware will be no-brainers based on the target device or the specific purpose of the embedded system. However, deciding on an operating system isn't always so clear cut. Assuming your development team is comfortable working with both Linux and MS Windows, you should closely examine the pros and cons of each OS to see which one will provide the most value for your project.


IBM assembles a 120 PB storage array
IBM has just broken the world record in this area by assembling a 120 PB storage unit, i.e. 120 million gigabytes. The system is made up of 200,000 spinning hard drives (which works out to 600 GB per drive).

This colossal capacity would be enough to store 60 copies of the largest existing Internet archive, or 24 billion well-encoded MP3 files.


Of course, the feat is not so much putting 200,000 hard drives in the same place as managing them as a whole and coping with the failures that are inevitable with that many devices; that is where the expertise IBM wants to showcase lies. Everything is handled by expert systems that rebuild, in the background and completely transparently, the data of any drive that dies, which, given their number, must happen several times a day.

The array uses a file system called GPFS, developed to greatly improve data access and throughput by creating RAID 0-like stripes across several disk groups; it recently broke another record by indexing 10 billion files in just 43 minutes.

IBM now intends to sell this kind of solution to owners of supercomputers, but also to companies doing cloud computing. If that practice becomes widespread, they will indeed need almost unimaginable amounts of storage in their data centers.


IBM's new transactional memory: make-or-break time for multithreaded revolution
The BlueGene/Q processors that will power the 20 petaflops Sequoia supercomputer being built by IBM for Lawrence Livermore National Labs will be the first commercial processors to include hardware support for transactional memory.

Transactional memory could prove to be a versatile solution to many of the issues that currently make highly scalable parallel programming a difficult task.

Most research so far has been done on software-based transactional memory implementations. The BlueGene/Q-powered supercomputer will allow a much more extensive real-world testing of the technology and concepts. The inclusion of the feature was revealed at Hot Chips last week.
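To see what problem transactional memory addresses, here is a minimal Java sketch of the usual pain point: two operations that are each thread-safe do not compose into a thread-safe transfer without an explicit lock, whereas a hardware or software transaction would let the whole block execute atomically without that lock. The class, account names, and lock below are purely illustrative, not an IBM or JVM API.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class Accounts {
    private final ConcurrentMap<String, Long> balances = new ConcurrentHashMap<String, Long>();
    private final Object transferLock = new Object(); // coarse-grained lock

    public Accounts() {
        balances.put("alice", 100L);
        balances.put("bob", 100L);
    }

    // Each get/put is thread-safe on its own, but the sequence is not: without
    // the lock, another thread could observe the debit before the credit, or two
    // transfers could interleave and lose an update. With transactional memory,
    // the body would run as one atomic transaction and the lock would disappear.
    public void transfer(String from, String to, long amount) {
        synchronized (transferLock) {
            balances.put(from, balances.get(from) - amount);
            balances.put(to, balances.get(to) + amount);
        }
    }
}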


Modules, modules, modules …
I think everybody will agree that writing modular applications, and modularity in general, is a good thing. But what does support for modularity look like, both in the Java and Scala languages and in the various Java/Scala frameworks? There are many different approaches! Let's look at some of them. Below, "protection" means how well modules are separated, either at compile time or at run time.
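As a tiny illustration of compile-time "protection", here is a sketch of the most basic approach plain Java offers: splitting a module into a published API package and a hidden implementation package. The package and class names are invented for the example; frameworks such as OSGi enforce this kind of boundary at run time rather than by convention.

// --- File: com/example/greeter/api/Greeter.java (the module's published API)
package com.example.greeter.api;

public interface Greeter {
    String greet(String name);
}

// --- File: com/example/greeter/internal/DefaultGreeter.java (hidden implementation)
package com.example.greeter.internal;

import com.example.greeter.api.Greeter;

// Clients are expected to depend only on the api package; nothing but
// convention stops them from reaching into internal, which is exactly the
// weakness that stronger module systems address.
public class DefaultGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name + "!";
    }
}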


JSR-107, JCache: Alive and Going to be Part of Java EE 7
Distributed caching is the tip of the spear for performance and scalability, yet Java still does not have a completed standard caching mechanism. JSR-107, the JCache API, is being actively worked on and will be included in Java EE 7. JSR-107 has gained some notoriety over the years because it is one of the older JSRs yet has never been completed, but given the increased demand for caching, JSR-107 will finally see the light of day.
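To give an idea of what a standardized cache looks like to application code, here is a short sketch against the javax.cache API; since JSR-107 was still a draft, the exact class names and signatures below should be treated as an assumption rather than a frozen contract.

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.configuration.MutableConfiguration;

public class JCacheExample {
    public static void main(String[] args) {
        // Obtain the provider-neutral CacheManager; the JCache provider found
        // on the classpath supplies the actual implementation.
        CacheManager cacheManager = Caching.getCachingProvider().getCacheManager();

        MutableConfiguration<String, Integer> config =
                new MutableConfiguration<String, Integer>()
                        .setTypes(String.class, Integer.class);

        Cache<String, Integer> scores = cacheManager.createCache("scores", config);

        scores.put("alice", 42);                  // standard put/get semantics
        System.out.println(scores.get("alice"));  // -> 42
    }
}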


Disruptor 2.0 Released
Significantly improved performance and a cleaner API are the key takeaways for the Disruptor 2.0 concurrent programming framework.

Performance
Significant performance tuning effort has gone into this release. This effort has resulted in a ~2-3x improvement in throughput, depending on CPU architecture. For most use cases it is now an order of magnitude better than queue-based approaches. On Sandy Bridge processors I've seen 40-50 million events processed per second.
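The key to those numbers is the ring-buffer design rather than any language trick: events are pre-allocated, producers and consumers coordinate through sequence numbers instead of locks, and slots are reused so the garbage collector stays idle. The sketch below is a deliberately simplified single-producer/single-consumer illustration of that idea in plain Java, not the Disruptor API itself; the real library adds batching, wait strategies, barriers and wrap protection.

import java.util.concurrent.atomic.AtomicLong;

public class TinyRing {
    static final int SIZE = 1024;                  // power of two -> cheap index masking
    final long[] slots = new long[SIZE];           // pre-allocated, no allocation per event
    final AtomicLong cursor = new AtomicLong(-1);  // last published sequence

    // Single producer: claim the next sequence, write the slot, then publish.
    // (No wrap protection here: a real implementation must not lap the consumer.)
    public void publish(long value) {
        long seq = cursor.get() + 1;
        slots[(int) (seq & (SIZE - 1))] = value;
        cursor.lazySet(seq);                       // ordered store makes the slot visible
    }

    // Single consumer: read every published sequence in order, spinning when it
    // catches up (this mirrors a busy-spin wait strategy).
    public void consume() {
        long next = 0;
        while (true) {
            long available = cursor.get();
            while (next <= available) {
                long value = slots[(int) (next & (SIZE - 1))];
                // ... handle value ...
                next++;
            }
        }
    }
}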


Virtual Extension Methods (or, wedging multiple inheritance into the JVM)
Goals
• Encourage the creation of more abstract, high-performance libraries
    • Secondary goal: encourage a more side-effect-free programming model
• Simplify the consumption of such libraries through a concise code-as-data mechanism
• Provide for better library evolution and migration
    • Collections are looking long in the tooth
    • Lambdas without broad library support would be disappointing
• Secondary goal: keep doors open
    • Function types (but requires reification)
    • Control abstraction (but lots of work needed to get there)
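Concretely, a virtual extension method lets an interface ship a default implementation, so existing implementers keep compiling when the interface gains new methods. Here is a small sketch using the default-method syntax that was ultimately adopted in JDK 8; the exact syntax was still under discussion at the time, and the interface names below are invented for the example.

// A caller-supplied action; in JDK 8 this role is typically played by a lambda.
interface Block<T> {
    void apply(T item);
}

// The interface can evolve: forEachItem() is new, but existing implementations
// of SimpleCollection keep compiling because a default body is supplied --
// this is the "virtual extension method".
interface SimpleCollection<T> extends Iterable<T> {
    default void forEachItem(Block<T> block) {
        for (T item : this) {
            block.apply(item);
        }
    }
}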


Garbage collection with Automatic Resource Management in Java 7
This post provides a brief overview of a new feature introduced in Java 7 called Automatic Resource Management, or ARM. The post delves into how ARM tries to reduce the code that a developer has to write to release allocated resources and let the JVM reclaim them efficiently.
One of the sweet spots of programming in the Java language is the automatic handling of object deallocation. In the Java world this is better known as garbage collection; it basically means that developers do not have to worry about deallocating the objects their code allocates. As soon as developers are finished using an object, they can drop all references to it, and the object becomes eligible for garbage collection.
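In Java 7, ARM surfaces as the try-with-resources statement, the Java counterpart of the C# using statement (the concept mentioned in the introduction): any AutoCloseable declared in the try header is closed automatically, in reverse declaration order, whether the block completes normally or throws. A minimal example follows; the file name is illustrative.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ArmExample {
    public static void main(String[] args) throws IOException {
        // The reader is closed automatically, even if readLine() throws,
        // so no explicit finally block is needed.
        try (BufferedReader reader = new BufferedReader(new FileReader("data.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}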


Java NIO vs. IO
When studying both the Java NIO and IO APIs, a question quickly comes to mind:
When should I use IO and when should I use NIO?
In this text I will try to shed some light on the differences between Java NIO and IO, their use cases, and how they affect the design of your code.
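To make the contrast concrete, here is a small sketch that reads the same file both ways: classic IO is blocking and stream-oriented (you pull bytes from an InputStream), while NIO is buffer-oriented (a FileChannel fills a ByteBuffer that you then inspect), and the same channel/buffer model extends to non-blocking sockets and selectors. The file name is illustrative.

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class IoVsNio {
    // Classic IO: blocking, stream-oriented -- bytes are consumed as they arrive.
    static long countBytesWithIo(String path) throws IOException {
        long total = 0;
        try (FileInputStream in = new FileInputStream(path)) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        }
        return total;
    }

    // NIO: buffer-oriented -- the channel fills a ByteBuffer which is then
    // flipped for reading; the same model works with non-blocking channels.
    static long countBytesWithNio(String path) throws IOException {
        long total = 0;
        try (FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            while (channel.read(buffer) != -1) {
                buffer.flip();
                total += buffer.remaining();
                buffer.clear();
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("IO:  " + countBytesWithIo("data.txt"));
        System.out.println("NIO: " + countBytesWithNio("data.txt"));
    }
}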