If your software is supposed to respond within fractions of a second, every line of code works against you.
Smart coding here is about consuming the fewest CPU cycles. If you are a latency addict, read on for some ideas.
(I plan to keep updating this post)
Language of choice
C/C++ is traditionally the language of choice. At a low level, you can control and optimize the generated instructions well. Requirements vary, though, and one does not always have this freedom. I have read in many places that Java is also a good alternative. This post is mostly focused on C/C++ ideas. Whatever the language, there are good profiling tools available to identify bottlenecks.
Spawn threads for functionality that can run in parallel. If the whole design cannot be multi-threaded, small sub-modules can often still be moved into their own threads. Just be careful not to have too many threads. Stay close to the number of CPU cores (double that if hyper-threading is enabled, since each core then shows up as two logical cores). This way you save on thread context switching.
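As a rough illustration of sizing the thread count to the machine, here is a minimal C++ sketch. The names worker_count and parallel_sum are made up for this post, and summing a vector is just a stand-in for real work. std::thread::hardware_concurrency() reports logical cores (so hyper-threads are usually already counted) and is allowed to return 0 when the value is unknown, hence the fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: pick a worker count close to the number of logical cores.
// hardware_concurrency() may return 0 if the value cannot be determined.
inline unsigned worker_count() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}

// Sum a vector by giving each worker one contiguous chunk.
inline long long parallel_sum(const std::vector<int>& data, unsigned workers) {
    assert(workers > 0);
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> pool;
    pool.reserve(workers);

    // Chunk size rounded up so that every element is covered.
    const std::size_t chunk = (data.size() + workers - 1) / workers;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        pool.emplace_back([&data, &partial, w, begin, end] {
            // Each worker writes only its own slot, so no lock is needed.
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling parallel_sum(data, worker_count()) keeps one thread per logical core. Whether that is the best number for a given workload is something to measure rather than assume.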
If your software works as a flow, e.g. some kind of packet processing, the processes/threads should be designed with buffers in between. Say you have to perform ten big operations on every packet: divide the functionality into ten stages. Have a dedicated module (process/thread) for every operation, with a buffer between each pair of modules.
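The staged design above can be sketched with a small bounded queue between threads. This is only an illustration with two stages instead of ten. BoundedBuffer and run_pipeline are hypothetical names, the "operations" (double the value, then add one) are placeholders for real packet work, and a production system might prefer a lock-free ring buffer over the mutex used here.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity buffer that sits between two pipeline stages.
template <typename T>
class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the buffer is full.
    void push(T item) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return q_.size() < capacity_; });
        q_.push(std::move(item));
        lock.unlock();
        not_empty_.notify_one();
    }

    // Blocks while the buffer is empty; returns std::nullopt once the
    // buffer has been closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return std::nullopt;
        T item = std::move(q_.front());
        q_.pop();
        lock.unlock();
        not_full_.notify_one();
        return item;
    }

    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable not_empty_, not_full_;
    std::queue<T> q_;
    std::size_t capacity_;
    bool closed_ = false;
};

// Two placeholder stages: stage 1 doubles each packet, stage 2 adds one.
inline std::vector<int> run_pipeline(const std::vector<int>& packets) {
    BoundedBuffer<int> a(8), b(8);
    std::vector<int> out;
    std::thread stage1([&] {
        while (auto p = a.pop()) b.push(*p * 2);
        b.close();
    });
    std::thread stage2([&] {
        while (auto p = b.pop()) out.push_back(*p + 1);
    });
    for (int p : packets) a.push(p);
    a.close();
    stage1.join();
    stage2.join();
    return out;
}
```

The small capacity gives back-pressure: a slow stage makes the stage before it wait instead of letting the queue grow without limit.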
Based on requirements, software could have processes or modules running on different machines.
RAM comes cheaper than run-time. Whatever needs to be accessed frequently should be pre-allocated. Design smart data structures. For example, for a hash table implementation, allocating a table about twice the size of the expected number of elements helps keep collisions low. Both insertion and access will be faster this way. Address lookups in networking software use this logic.
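With the standard library, the "twice the size" idea can be approximated as below. This is a sketch, not the way any particular networking stack does it. make_table is a hypothetical name, and the key/value types (a 32-bit address mapped to an int port) are assumptions for illustration. A maximum load factor of 0.5 means at least two buckets per element, and reserve() allocates those buckets up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical address table: 32-bit address -> output port.
inline std::unordered_map<std::uint32_t, int> make_table(std::size_t expected) {
    assert(expected > 0);
    std::unordered_map<std::uint32_t, int> table;
    table.max_load_factor(0.5f);  // at least two buckets per stored element
    table.reserve(expected);      // allocate buckets now; no rehash until
                                  // more than `expected` elements are stored
    return table;
}
```

Extra buckets reduce collisions but do not eliminate them. The gain is that the cost of growing and rehashing is paid once at start-up instead of on the hot path.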
In C/C++, strings can be big run-time consumers, since almost every operation (length, copy, compare) is a loop over the characters. Avoid string operations wherever you can. If you need to do pattern matching, devote some time to designing good regex handling algorithms.
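One cheap way to cut string work, assuming C++17 is available, is to hand around std::string_view instead of copying substrings. The function below (nth_field is a made-up name) picks one field out of a delimited line without allocating or copying anything. The returned view is only valid while the original buffer is alive.

```cpp
#include <cstddef>
#include <string_view>

// Return the n-th (0-based) field of `line` split on `sep`, without copying.
// Returns an empty view if the line has fewer than n + 1 fields.
inline std::string_view nth_field(std::string_view line, char sep, std::size_t n) {
    std::size_t start = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t pos = line.find(sep, start);
        if (pos == std::string_view::npos) return {};
        start = pos + 1;  // skip past the separator
    }
    const std::size_t end = line.find(sep, start);
    const std::size_t len =
        (end == std::string_view::npos) ? std::string_view::npos : end - start;
    return line.substr(start, len);
}
```

For example, nth_field("a,bb,ccc", ',', 1) yields a view over "bb" that points into the original line.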
Choice of IPC
Brainstorm well on the IPC alternatives, and make the choice only after experimenting with them.
I plan to write a separate post on the runtime analysis whenever I get time.
In multi-threaded code, hold locks for the shortest possible intervals. Avoid locks wherever you can without screwing things up 🙂
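A small sketch of both halves of that advice. ResultLog and hammer are hypothetical names, and the string formatting is a placeholder for any expensive step. The expensive part runs before the lock is taken, the lock covers only the push, and the counter uses an atomic so it needs no lock at all.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class ResultLog {
public:
    void record(int packet_id) {
        // Expensive work (here: formatting) happens outside the lock.
        std::string line = "packet " + std::to_string(packet_id);
        {
            std::lock_guard<std::mutex> lock(m_);  // held only for the push
            lines_.push_back(std::move(line));
        }
        // A simple counter does not need the mutex; an atomic add is enough.
        count_.fetch_add(1, std::memory_order_relaxed);
    }

    std::size_t count() const { return count_.load(std::memory_order_relaxed); }

    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> lock(m_);
        return lines_;
    }

private:
    mutable std::mutex m_;
    std::vector<std::string> lines_;
    std::atomic<std::size_t> count_{0};
};

// Drive the log from several threads at once.
inline void hammer(ResultLog& log, unsigned threads, int per_thread) {
    assert(threads > 0);
    std::vector<std::thread> pool;
    pool.reserve(threads);
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&log, per_thread] {
            for (int i = 0; i < per_thread; ++i) log.record(i);
        });
    }
    for (auto& th : pool) th.join();
}
```

Relaxed ordering is enough here only because the counter is a statistic read after the threads have been joined. If other data were published through it, a stronger ordering would be needed.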