to restructure software by applying a series of refactorings without changing its observable behavior.
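I have gradually started running into this kind of problem myself, and I would like to learn from past experience. There is a book, "Refactoring: Improving the Design of Existing Code", by Martin Fowler and Kent Beck, first published in 1999 [1]. Here are some notes on it.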
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. — Donald Knuth
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
What is it that makes programs hard to work with?
- Programs that are hard to read are hard to modify.
- Programs that have duplicated logic are hard to modify.
- Programs that require additional behavior that requires you to change running code are hard to modify.
- Programs with complex conditional logic are hard to modify.
Steps in refactoring
- Building tests.
- Changing the program in small steps, so it’s easy to trace bugs. Follow the rhythm: test, small change, test, small change…
- Never be afraid to rename things for clarity, especially internally [2].
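The test, small change, test rhythm can be sketched as follows. This is a minimal illustration, not code from the book: the invoice functions, the threshold, and the discount rate are all hypothetical. The same behavior check is run against the code before and after the refactoring, so any regression shows up immediately.

```python
# Hypothetical "before" version: one function mixes summing and discounting.
def invoice_total_cents_before(prices, quantities):
    total = 0
    for i in range(len(prices)):
        total = total + prices[i] * quantities[i]
    if total > 10000:
        total = total - total * 10 // 100
    return total


# "After" version: same behavior for equal-length inputs, split into named steps.
BULK_THRESHOLD_CENTS = 10000   # assumed threshold, for illustration only
BULK_DISCOUNT_PERCENT = 10     # assumed discount rate, for illustration only


def subtotal_cents(prices, quantities):
    # Assumes prices and quantities have the same length.
    return sum(p * q for p, q in zip(prices, quantities))


def apply_bulk_discount(amount):
    if amount > BULK_THRESHOLD_CENTS:
        return amount - amount * BULK_DISCOUNT_PERCENT // 100
    return amount


def invoice_total_cents_after(prices, quantities):
    return apply_bulk_discount(subtotal_cents(prices, quantities))


def check_behavior(total_func):
    """Tests written before refactoring; rerun after every small change."""
    assert total_func([], []) == 0
    assert total_func([1000], [2]) == 2000
    assert total_func([10000], [1]) == 10000       # exactly at the threshold
    assert total_func([6000, 5000], [1, 1]) == 9900  # discount applied


check_behavior(invoice_total_cents_before)
check_behavior(invoice_total_cents_after)
```

Amounts are kept as integer cents so the checks do not depend on floating-point rounding.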
Most refactorings reduce the amount of code. If you happen to have one which increases it, think twice.
Quoted from Don Roberts:
The first time you do something, you just do it. The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway. The third time you do something similar, you refactor. Three strikes and you refactor.
When do you need to refactor:
- when adding a function;
- when fixing a bug;
- when reviewing code.
What you need to achieve with refactoring:
- To enable sharing of logic.
- To explain intention and implementation separately.
- To isolate change.
  - I use an object in two different places. I want to change the behavior in one of the two cases. If I change the object, I risk changing both. So I first make a subclass and refer to it in the case that is changing. Now I can modify the subclass without risking an inadvertent change to the other case.
- To encode conditional logic.
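The subclass approach to isolating change might look like the sketch below. `ReportFormatter`, `GroupedReportFormatter`, and the two caller functions are hypothetical names made up for illustration. Only the caller that needs the new behavior refers to the subclass, so the other caller cannot be affected by accident.

```python
class ReportFormatter:
    """Hypothetical class that was originally shared by two callers."""

    def format_amount(self, amount):
        return f"{amount:.2f}"


class GroupedReportFormatter(ReportFormatter):
    """Subclass created for the one caller whose behavior must change."""

    def format_amount(self, amount):
        # New behavior: thousands separators. The base class is untouched.
        return f"{amount:,.2f}"


def invoice_line(amount):
    # Unchanged caller: still uses the original class.
    return "Total: " + ReportFormatter().format_amount(amount)


def dashboard_line(amount):
    # Changing caller: now refers to the subclass.
    return "Total: " + GroupedReportFormatter().format_amount(amount)


print(invoice_line(1234.5))    # Total: 1234.50
print(dashboard_line(1234.5))  # Total: 1,234.50
```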
Special Attention to Classes and Objects
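- Sometimes conditional branches can be replaced with polymorphism. Or, in a Julian way, with multiple dispatch: the function to call is chosen by the compiler based on the argument types, rather than by branches the programmer writes by hand.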
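As a rough illustration of replacing branches with polymorphism, here is a minimal Python sketch. The `LengthUnit` hierarchy and `axis_tick_labels` are hypothetical names chosen for this example. Instead of writing `if unit == "km": ... elif unit == "m": ...` at every place the axis data is used, each unit object carries its own label and conversion.

```python
from abc import ABC, abstractmethod


class LengthUnit(ABC):
    """Hypothetical unit type; each subclass knows its own label and scale."""

    @abstractmethod
    def label(self):
        ...

    @abstractmethod
    def from_meters(self, value):
        ...


class Meters(LengthUnit):
    def label(self):
        return "m"

    def from_meters(self, value):
        return value


class Kilometers(LengthUnit):
    def label(self):
        return "km"

    def from_meters(self, value):
        return value / 1000.0


def axis_tick_labels(values_in_meters, unit):
    # No branching on the unit here: the unit object decides what to do.
    return [f"{unit.from_meters(v):g} {unit.label()}" for v in values_in_meters]


print(axis_tick_labels([0.0, 1500.0], Meters()))      # ['0 m', '1500 m']
print(axis_tick_labels([0.0, 1500.0], Kilometers()))  # ['0 km', '1.5 km']
```

Adding a new unit then means adding one class, not touching every call site. Python has no built-in multiple dispatch; the closest standard-library tool is `functools.singledispatch`, which dispatches on the type of the first argument only.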
If a refactoring changes a published interface, you have to retain both the old interface and the new one, at least until your users have had a chance to react to the change. Fortunately, this is not too awkward. You can usually arrange things so that the old interface still works. Try to do this so that the old interface calls the new interface. In this way when you change the name of a method, keep the old one, and just let it call the new one. Don’t copy the method body—that leads you down the path to damnation by way of duplicated code. You should also use the deprecation facility in your programming language to mark the code as deprecated. That way your callers will know that something is up.
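In Python, one way to follow this advice is to keep the old method as a thin wrapper that emits a `DeprecationWarning` and delegates to the new one. The sketch below assumes a hypothetical `Account` class whose method `get_bal` is being renamed to `current_balance`; neither name comes from the book.

```python
import warnings


class Account:
    """Hypothetical class with a published method that is being renamed."""

    def __init__(self, balance):
        self._balance = balance

    def current_balance(self):
        """New interface: the single place where the logic lives."""
        return self._balance

    def get_bal(self):
        """Old interface: kept for existing callers, delegates to the new one."""
        warnings.warn(
            "get_bal() is deprecated; use current_balance() instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this wrapper
        )
        return self.current_balance()


account = Account(42)
print(account.current_balance())  # 42
```

Because the old method contains no logic of its own, the two interfaces cannot drift apart, and the wrapper can be deleted once callers have migrated.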
When Shouldn’t You Refactor?
There are times when you should not refactor at all. The principal example is when you should rewrite from scratch instead. There are times when the existing code is such a mess that although you could refactor it, it would be easier to start from the beginning. This decision is not an easy one to make, and there are no good guidelines for it.
Keep performance in mind, but as a general rule, cleaner code provides more space for optimization. Performance optimization often makes code harder to understand, but you need to do it to get the performance you need.
The interesting thing about performance is that if you analyze most programs, you find that they waste most of their time in a small fraction of the code. If you optimize all the code equally, you end up with 90 percent of the optimizations wasted, because you are optimizing code that isn’t run much. The time spent making the program fast, the time lost because of lack of clarity, is all wasted time.
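One way to find that small fraction of code before optimizing anything is to measure with a profiler. The sketch below uses Python's standard `cProfile` and `pstats` modules; `cold_setup`, `hot_loop`, `run_program`, and `profile_call` are hypothetical names for a toy program, not anything from the book.

```python
import cProfile
import io
import pstats


def cold_setup():
    # Cheap code that runs once; not worth optimizing.
    return list(range(10))


def hot_loop(n):
    # The part where this toy program spends most of its time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_program():
    cold_setup()
    return hot_loop(200_000)


def profile_call(func):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func()
    profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats()
    return result, report.getvalue()


result, report = profile_call(run_program)
print(report)
```

The report lists each function with its call count and cumulative time, which shows where optimization effort is likely to pay off.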
One live example quoted from the book:
Our biggest improvement was to run the program in multiple threads on a multiprocessor machine. The system wasn’t designed with threads in mind, but because it was so well factored, it took us only three days to run in multiple threads.
Bad Smells in Code
- Long method
- Large class
- Long parameter list
- Data clump: bunches of data that hang around together really ought to be made into their own object
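- Switch statements: be alert when you see the same switch statement scattered about a program in multiple places.
  - I have seen, and written, plenty of examples like this myself. In practice it is not as simple as you first imagine. Say you originally wrote a plotting function that draws contours. Later you want to extend it so that the coordinate units can be changed. But the x and y axis data are used in many places, and at each of them you need to add a check on the units; yet given the original logic, gathering all the post-check code in one place may feel awkward. Another example is Bart's implementation of the solid body, which is very hacky: a single variable is checked over and over throughout the original algorithm, deciding whether this step should be switched on, whether the next step should be switched off, and so on. The final product is countless repeated checks piled on top of each other, and whether it is correct depends entirely on how clear-headed the author was when writing it.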
- Comments: many are misused as deodorant. It’s surprising how often you look at thickly commented code and notice that the comments are there because the code is bad.
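  - A recent example: in Vlasiator, the fix for the mismatch between the initial and boundary velocity block counts. It needed a long block of comments precisely because the logic of the code itself is muddled.
  - A good time to use a comment is when you don't know what to do, not to explain why you did something poorly.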
Here is a live example in Python 3.9 for bad smell in code:
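The snippet below is a minimal illustrative sketch of the "comments as deodorant" smell, not code taken from a real project; all function and variable names are hypothetical. The first version needs comments to be understood, while the second carries the same information in its names. It runs on Python 3.9.

```python
# Smelly version: the comments carry the meaning that the names should carry.
def calc(d, t, f):
    # d is distance in km, t is time in hours
    # f is True if we want m/s instead of km/h
    r = d / t  # speed
    if f:
        r = r * 1000 / 3600  # convert km/h to m/s
    return r


# Refactored version: names explain the intent, so the comments become unnecessary.
METERS_PER_KILOMETER = 1000
SECONDS_PER_HOUR = 3600


def speed_kmh(distance_km, duration_h):
    return distance_km / duration_h


def kmh_to_ms(speed_in_kmh):
    return speed_in_kmh * METERS_PER_KILOMETER / SECONDS_PER_HOUR


print(calc(36, 1, True))             # 10.0
print(kmh_to_ms(speed_kmh(36, 1)))   # 10.0
```

The boolean flag `f` is also gone: callers that want meters per second say so explicitly by calling `kmh_to_ms`.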
Remember to keep the code simple and self-descriptive so that, where possible, no extra explanatory comments are needed.
Sometimes, refactoring is more about soft skills and decision making. I watched this interesting video
while thinking more about my experience in a real working environment. Choosing who to work with may be more important than the work itself.