
我的征尘是星辰大海。。。

The dirt and dust from my pilgrimage forms oceans of stars...

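-------When the chapters of memory become fragmented and the pictures of recollection grow blurred, we can only turn to the eternal memory of digital storage

Author: Professor Huang

2019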


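January 5: waiting for change, waiting for opportunity

This was a long and winding exploration. The problem itself may not be that complicated, but it took many detours. It started because I wanted to stamp some warning text onto the videos I make, for example a notice that I do not intend to infringe any copyright. That part was answered quickly: use ffmpeg's drawtext filter: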
ffmpeg -i /tmp/ffcee09d56dccd865e05f35f88b17412.mp4 -vf drawtext="text='星辰大海': fontcolor=red: fontsize=24: box=1: boxcolor=white@0.5: boxborderw=5: x=(w-text_w)/2: y=(h-text_h)/2" -codec:a copy /tmp/output.mp4
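However, Chinese text did not render correctly. According to the drawtext documentation the only requirement is that the text be UTF-8 encoded, so to make sure of that I used the textfile parameter to read the text from a file. The question then became: how do I tell whether my input file really is UTF-8? The traditional gedit had a menu for choosing the encoding, but the version I use does not, so I put that route aside for now. So how do I determine a file's encoding? Compared with iconv, enca is much easier to use. Its own description: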
Enca is an Extremely Naive Charset Analyser. It detects character set and encoding of text files and can also convert them to other encodings using either a built-in converter or external libraries and tools like libiconv, librecode, or cstocs.
When using enca or enconv you need to supply very few arguments; by default it guesses. Sometimes, though, a hint is required:
       The main reason why these command will fail and turn  your  files  into
       garbage  is that Enca needs to know their language to detect the encod‐
       ing.  It tries to determine your language and  preferred  charset  from
       locale settings, which might not be what you want.
       You can (or have to) use -L option to tell it the right language.  Sup‐
       pose, you downloaded some Russian HTML file, `file.htm', it claims it's
       windows-1251 but it isn't.  So you run
              enca -L ru file.htm
       and find out it's KOI8-R (for example).  Be warned, currently there are
       not many supported languages (see section LANGUAGES).
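How to read this? Enca relies on the system's default locale to detect the encoding, but often your locale is either unset or does not correspond to your file's language, and in that case the hint is mandatory: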
nick@nick-HP-ZBook-15u-G2:~$ enca Documents/input.txt
enca: Cannot determine (or understand) your language preferences.
Please use `-L language', or `-L none' if your language is not supported
(only a few multibyte encodings can be recognized then).
Run `enca --list languages' to get a list of supported languages.
nick@nick-HP-ZBook-15u-G2:~$  LC_ALL=zh_CN.utf8  enca Documents/input.txt
Universal transformation format 8 bits; UTF-8
nick@nick-HP-ZBook-15u-G2:~$ 
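Why was I wasting time on this? What was my original question? It was about the language setting of the on-screen text. The reason is that ffmpeg/drawtext still could not render Chinese correctly, and only after reading a blog post did I understand that it was, once again, a matter of the locale in effect on the command line. This scrolling caption works very nicely: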
LANG=zh_CN.utf8 ffmpeg -y -i /tmp/ffceed56dccd865e05f35f88b17412.mp4 -vf "drawtext=fontcolor=green:fontsize=36:shadowy=4:\x='if(gte(t,1), (main_w-mod(t*30,main_w)), NAN)':y=(main_h-line_h-10):text='Hello, 滚动字幕我来了\!\!\!\!'" -codec:a copy /tmp/output.mp4

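January 14: waiting for change, waiting for opportunity

How do I install libdvdcss on a freshly installed Ubuntu? Only this worked for me, because I still have not figured out why libdvdread/libdvdnav and the like can optionally link against libdvdcss: sudo apt-get install ubuntu-restricted-extras. There is also a basic point hardly worth mentioning: when rebuilding libdvdcss (or really any project) I hit an aclocal-1.xx error. The cause is simple. Under the previous Ubuntu release I had already configured the tree with an older automake, so the generated aclocal.m4 belonged to that older version. This is not a version-compatibility problem but a leftover of bootstrapping. The fix is to delete the generated files, including aclocal.m4 and configure, and regenerate them (autoreconf -fi reruns aclocal, autoconf and automake in the right order). I run into this again and again, yet my understanding of it has always been fuzzy. Another embarrassing one: a method every QA engineer knows felt like discovering a new continent to me. How do you remove redundant elements from an array? sort plus unique! Note that unique only removes adjacent duplicates, which is why you sort first, and it does not shrink the container: it moves the kept elements to the front and returns an iterator to the new logical end, so the tail still has to be erased. I first wrote this with unique_copy using the vector itself as the destination, but unique_copy requires that source and destination do not overlap, so plain unique is the right call:
vector<string>::iterator it = unique(vect.begin(), vect.end()); // vect must already be sorted
vect.erase(it, vect.end());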

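January 15: waiting for change, waiting for opportunity

A plain file copy of a DVD is problematic; the copy should go through the libdvdcss decryption mechanism, so dvdbackup is the tool for the job. I find this command quite handy: dvdbackup -i /dev/sr0 -o ./ -M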

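January 17: waiting for change, waiting for opportunity

Changing the Ubuntu desktop requires tools like tweak; the older gconf-editor and friends have been retired. Today I finally realized that my program that periodically updates S3 needs libcurl-openssl3 support at run time, something I had overlooked when upgrading to 18.04. The symptom is: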
/usr/lib/x86_64-linux-gnu/libcurl.so.4: version `CURL_OPENSSL_3' not found (required by /BigDisk/diabloforum/public_html/tools/myLibS3-link)
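Installing it directly did not seem to work; this is what finally did: apt install libcurl3 libcurl-openssl1.0-dev
For converting DVDs to mp4 files I find HandBrakeCLI easier to use, because being able to select the subtitle language is a big advantage. Working out the equivalent ffmpeg parameters would take more time, so this is the simplest way: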
for title in {1..25}; do HandBrakeCLI -e x264 -E av_aac -i ./VIDEO_TS/ -Z "High Profile" -s 1 --subtitle-burned 1 --subtitle-default 1 -t ${title} -o ~/Videos/Veep/season2disc2/video_title_${title}.mp4; done
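On Linux even the smallest detail is knowledge. For example, when using find to look for files smaller than 1M, passing -size -1M will only ever find empty files. Why? The man page explains it clearly: sizes are rounded up to the next whole unit (not rounded to nearest), so any non-empty file already counts as 1M and only a zero-byte file is less than 1M. Use a smaller unit instead, such as -size -1000k. Similarly, I still only half understand string handling in bash and keep forgetting it; at bottom it is my unfamiliarity with pattern matching. For example, here I need to extract the season and disc numbers: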
nick@nick-KGP-M-E-D16:~$ dir="Veep S1 Disc2"; season=${dir#Veep S*}; season=${season% Disc*}; disk=${dir#*Disc*};echo "dir is *$dir* and season is *$season* and disc is *$disk*";
dir is *Veep S1 Disc2* and season is *1* and disc is *2*

January 29: waiting for change, waiting for opportunity

I bought the complete Sherlock Holmes DVD set:

for i in {1..12}; do dir="Sherlock Holmes Disc ${i}"; titleNumber=$(lsdvd "${currentDir}/DVD/${dir}" 2>/dev/null| grep ^'Title: '|wc -l); for title in $(seq 1 ${titleNumber}); do HandBrakeCLI -e x264 -E av_aac -i "${currentDir}/DVD/${dir}" -Z "High Profile" -s 1 --subtitle-burned 1 --subtitle-default 1 -t ${title} -o ${newdir}/holms_title_${title}.mp4; done; done;
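But something always went wrong with it. In the end, from a parallel-processing point of view it is also more efficient to run several jobs at the same time, one command per disc: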
dir="Sherlock Holmes Disc 1"; titleNumber=$(lsdvd "${PWD}/DVD/${dir}" 2>/dev/null| grep ^'Title: '|wc -l); for title in $(seq 1 ${titleNumber}); do HandBrakeCLI -e x264 -E av_aac -i "${PWD}/DVD/${dir}" -Z "High Profile" -s 1 --subtitle-burned 1 --subtitle-default 1 -t ${title} -o ${PWD}/disc1/holms_title_${title}.mp4; done;

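February 8: waiting for change, waiting for opportunity

I have made some progress on displaying Chinese in vcmi. First, the code does support multiple languages on top of UTF-8. It does so by converting other encodings to UTF-8: boost::locale::conv::to_utf<char>(text,encoding); here "encoding" is the language encoding configured in vcmilauncher, for example gb2312 or gbk. This only means the code handles string encodings uniformly; it is still a long way from actually displaying Chinese.
Second, there is a so-called Simplified Chinese edition of HOMM3 on the market, the one I used before. It presumably has two layers. One is the localization work, such as the translated English text; many menus may simply be pre-rendered images shipped as resource files. The other is support for a Chinese encoding. My current guess is that the Chinese edition uses UTF-8 rather than gb2312 or similar as I originally assumed, because the many maps I downloaded have UTF-8 file names and, I believe, UTF-8 content as well. I have no proof for the content, but it seems reasonable: otherwise how would the Russian edition be done? A single UTF-8 encoding would settle it once and for all. For the file names I am fairly sure, because I ran a rather clumsy test: I saved the output of ls to a file, and file alone reported its encoding. Later I also tested both directions with boost::locale::conv::to_utf<char> and boost::locale::conv::from_utf<char>, which ruled out gb2312 completely. The question sounds silly (can't you tell a string's encoding just by looking at it?), but I really could not.
The last finding concerns a Chinese TrueType mod that I downloaded and installed through vcmilauncher. I suspect its installation is faulty. HOMM3 defines 9 fonts, differing mainly in size for use in different places such as narrative text and menus. Multi-language support then loads the matching font for each of them at resource-loading time according to a mapping. This is the core of what I need to solve, because the Chinese fonts in the downloaded mod are neither configured correctly nor loaded correctly. The configuration lives in a font.json file,
which explains the 9 fonts: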

{
        // Original HoMM 3 bitmap fonts
        // Stored in H3Bitmap.lod with fnt extension
        // Warning: Do not change number of entries in this list
        "bitmap" :
        [
                "BIGFONT",  // 25px, Mostly used for window titles
                "CALLI10R", // 17px, Unused in VCMI
                "CREDITS",  // 31px, Used for credits menu
                "HISCORE",  // 25px, Unused in VCMI
                "MEDFONT",  // 20px, Some titles
                "SMALFONT", // 16px, Most of the messages
                "TIMES08R", // 15px, Used to display amounts on creature card 
                "TINY",     // 11px, Some text
                "VERD10B"   // 16px, Unused in VCMI
        ],
        // Chinese fonts, this list contains mapping to appropriate H3 fonts
        "bitmapHan" :
        {
                "BIGFONT" : { "name" : "hzk24_GBK", "size" : 24 },
                "CALLI10R": { "name" : "hzk10_GBK", "size" : 10 },
                "CREDITS" : { "name" : "hzk24_GBK", "size" : 24 },
                "HISCORE" : { "name" : "hzk12_GBK", "size" : 12 },
                "MEDFONT" : { "name" : "hzk12_GBK", "size" : 12 },
                "SMALFONT": { "name" : "hzk12_GBK", "size" : 12 },
                "TIMES08R": { "name" : "hzk10_GBK", "size" : 10 },
                "TINY"    : { "name" : "hzk10_GBK", "size" : 10 },
                "VERD10B" : { "name" : "hzk10_GBK", "size" : 10 }
        },
        // True type replacements
        // Should be in format:
        // <replaced bitmap font name, case-sensetive> : <true type font description>
        // "file" - file to load font from, must be in data/ directory
        // "size" - point size of font
        // "style" - italic and\or bold, indicates font style
        // "blend" - if set to true, font will be antialiased
        "trueType":
        {
                //"BIGFONT"  : { "file" : "LiberationSerif-Bold.ttf", "size" : 22, "blend" : true},
                //"CALLI10R" : { "file" : "Georgia.ttf",       "size" : 10},
                //"CREDITS"  : { "file" : "LiberationSerif-Bold.ttf", "size" : 28},
                //"HISCORE"  : { "file" : "Georgia.ttf",       "size" : 13},
                //"MEDFONT"  : { "file" : "LiberationSerif-Bold.ttf", "size" : 16}, // breaks messages (from map events)
                //"SMALFONT" : { "file" : "LiberationSerif-Regular.ttf",       "size" : 13, "blend" : true},
                //"TIMES08R" : { "file" : "LiberationSerif-Regular.ttf", "size" :  11, "blend" : true},
                //"TINY"     : { "file" : "LiberationSerif-Regular.ttf",       "size" : 11, "blend" : true},
                //"VERD10B"  : { "file" : "Georgia.ttf",       "size" : 13}
        }
}
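Note that I copied this configuration out of that Chinese TrueType font mod. First, this config was never properly merged with vcmi's own font.json. (I admit I am completely lost about where vcmi's config really lives. When I build and install it myself the path is /opt/local/share/vcmi, but there is also an installation under .local/share/vcmi in my home directory, which I suspect is the official version installed by the package manager. There is yet another one under $HOME/.cache/share/vcmi, which looks like a cached copy made by the system; I reinstalled repeatedly and in the end could only update it by hand. Maybe fc-cache can refresh it? And there is also a configuration under $HOME/.vcmi; the version I first built myself failed to run, apparently because it needs such an empty config file there. In short, I am very confused about the correct configuration paths.)
Second, the resource file hzk12_GBK cannot have been copied correctly into data/ either, since the config file itself was not merged and I could not even find where it went! Furthermore, judging from the comments above, the difference between "name" and "file" seems significant. My guess is that name identifies some kind of resource, since file is certainly a resource file. What puzzles me is the file path: apparently those ttf files also work when placed in the system font directory? I have opened these so-called ttf files, though: each is just a simple image of the English characters, presumably with some algorithm to locate each letter in the image. A few dozen characters in total is no big deal; compared with Chinese, which easily needs thousands of characters, it is child's play.
That is what I have learned about Chinese display in vcmi. In short, I believe code support for the character set is not the issue; the problem is that the Chinese font files are not loaded correctly.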

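February 20: waiting for change, waiting for opportunity

Reading the master Bjarne Stroustrup's writing I feel like a schoolchild on the first day of class. If I may assert that C++ and C are two completely different languages, then C++11 and C++98 are even more so. That is a slight exaggeration for emphasis, but it is what I believe. Many of the new language features leave me staring as if at hieroglyphs; what used to be fenced in by compiler limitations has now opened into a whole new world.
Here I copied the master's small example verbatim. It shows a printf that both keeps argument types safe and checks that the number of arguments is correct, two things attackers routinely target. Its subtlety cannot be appreciated without going through it character by character: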

#include <iostream>
#include <exception>
#include <stdexcept> // std::runtime_error
using namespace std;
void myprintf(const char*s)
{
	if (s == nullptr) return;
	while (*s)
	{
		if (*s == '%' && *++s != '%' ) //elegant!!!
		{
			throw runtime_error("invalid format: missing argument");
		}
		cout << *s++;
	}
}
template <typename T, typename... Args>
void myprintf(const char*s, T value, Args... args)
{
	while (s && *s)
	{
		if (*s == '%' && *++s != '%')
		{
			cout << value;
			return myprintf(++s, args...);
		}
		cout << *s++;
	}
	throw runtime_error("extra argument is provided to print");
}
int main()
{
	myprintf("first is int %s, second is str %s, third is double %d, fourth is pointer %p", "first string",
			23, 5.66, NULL);
	return 0;
}
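The base case here is the overload that takes only the format string, where %% is the escape for printing a literal %. Do you see how concise the master's code is? I am in awe. The ellipsis in the template parameter typename... Args was also new to me; the master calls it a parameter pack, which you peel off layer by layer like an onion. The recursive template definition covers every type that cout << supports, and whether a type is supported is checked at compile time. The type "hint" character, on the other hand, is ignored: people make mistakes, and since we hand the value to cout anyway, why bother with a hint? Any %x is just a placeholder that marks where an argument goes, which in effect defines the number of arguments. So it is wrong to say that the format is type-checked at compile time. Recall that gcc can report format/argument mismatches for the real printf at compile time, and you see how this implementation differs: % here is purely a placeholder, whatever conversion character follows it makes no difference, and a wrong argument count is only detected at run time through the exception. I had forgotten this when rereading my notes and only understood it after experimenting.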

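February 23: waiting for change, waiting for opportunity

Debugging core dumps is probably one of the things programmers dread most. S found a known ipmitool bug involving memmove: if the size passed in is negative it causes a segmentation fault. I think the reason is that the size parameter is unsigned, so a negative value turns into a request to move an enormous block of memory, ending in either an access violation or corruption of the memory contents. Another question concerns the stack size set by the OS as a per-process limit: is the sum of the stacks of all threads created by pthread subject to that limit as well? A quick search shows that Linux threads (LWPs) are essentially the same as processes, so there is no notion of the initial thread's stack size bounding the total of the other threads' stacks, nor any real master/subordinate relationship between them. In short, the value shown by ulimit applies to every thread individually. So the crash I see at around 800 requests is not a stack overflow. Could it be the total heap size? I checked, and that is unlikely too, because the heap is bounded by each program's maximum memory size, which is usually set to "unlimited". So the cause lies elsewhere.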

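March 7: waiting for change, waiting for opportunity

I woke up trembling and no longer dare to claim I know C++ programming, because C++11 differs enormously from C++98; it is almost a brand-new language, and in nearly every section I find unfamiliar syntax and intent. For example, = delete in a function declaration, written much like the = 0 of a pure virtual function, forces the compiler not to generate the default version; likewise = default explicitly tells the compiler to generate it. Still, = 0 and = delete are very different: the former declares the function pure virtual, the latter forbids it from existing at all. I have not yet seen whether C++11 also changed how virtual functions are declared. So once again I fret over the old question of whether each C++98 workaround has a corresponding new C++11 construct. I would love a book that maps them and saves me the searching; perhaps a book on the history of C++, something like a changelist with summaries? The explanation below of why a constructor cannot be virtual is as good as it gets. I used to look down on Baidu, finding its results too commercial and too biased toward Chinese pages, but having been forced to use it during these few days in China I found that it seems to categorize programming material better. Another explanation, of course, is that this is the result of manual curation. In any case I think these quotations are quite good, so I am saving them here. For readability I lightly edited some formatting and wording.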
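Why a constructor cannot be a virtual function
  1. From the storage point of view

    A virtual function goes through a vtable (virtual function table), as everyone knows, and the object reaches its vtable through a pointer (the vptr) stored in the object's own memory. Here is the problem: if the constructor were virtual it would have to be called through the vtable, but the object has not been instantiated yet, so its memory does not exist and the vtable cannot be found. Hence a constructor cannot be virtual.
  2. From the usage point of view

    Virtual functions exist so that, with incomplete type information, the overridden function is the one that gets called. A constructor's very job is to initialize an instance, so making it virtual has no practical meaning, and there is no need for it. The point of a virtual function is that calling it through a base-class pointer or reference dispatches to the derived class's member function. A constructor, however, is called automatically when an object is created and can never be called through a base-class pointer or reference, hence the rule that constructors cannot be virtual.
  3. A constructor need not and may not be virtual, but a destructor is a different matter

    When creating an object we always name its exact type, even though we may later access it through a pointer or reference to its base class. Destruction is different: we often destroy an object through a base-class pointer. If the destructor is not virtual in that case, the object's real type is not recognized and the correct destructor is not called.
  4. In terms of implementation, the vtbl is set up only once the constructor has been called, so a constructor cannot be virtual

    In terms of meaning, the real type of the object cannot yet be determined while a constructor is running (because a derived class calls its base class's constructor). Moreover, a constructor provides initialization and runs only once in the object's lifetime; it is not part of the object's dynamic behavior, so there is little reason for it to be virtual.
  5. When a constructor is called, one of the first things it does is initialize its VPTR.

    • It can therefore only know that it is of the "current" class and is completely unaware of whether the object has further derived classes. When the compiler generates code for this constructor, it generates code for the constructor of this class, not for a base class and not for a derived class (a class does not know who inherits from it).
    • So the VPTR it uses must point to the VTABLE of this class. If it is the last constructor called, the VPTR stays initialized to that VTABLE for the lifetime of the object; but if a more-derived constructor is called afterwards, that constructor sets the VPTR to its own VTABLE, and so on until the last constructor finishes. The state of the VPTR is determined by the constructor called last. This is another reason why constructors are called in order from the base class to the most-derived class.
    • While this series of constructor calls is in progress, however, each constructor has set the VPTR to its own VTABLE. If a function call uses the virtual mechanism, it will only go through that constructor's own VTABLE, not the final one (which is in place only after all constructors have been called).
Why a destructor can be a virtual function
--------------------- 
Author: cainiao000001 
Source: CSDN 
Original: https://blog.csdn.net/cainiao000001/article/details/81603782 
Copyright notice: this is the blogger's original article; please include a link to the post when reproducing it.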
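I also tried some rather silly possibilities, such as whether a base class's constructor can be left undefined (= delete), or whether the base class's destructor can be blocked as well (= delete). It turns out neither works: once the base type is inherited, the derived class's constructor implicitly calls the base constructor. That call may go to a protected constructor, which was the usual workaround in C++98 before = delete existed. Hiding is not the same as removing, of course. For the destructor, though, even hiding it as protected causes trouble: when I delete the object through a pointer, the derived destructor implicitly calls the base destructor, but deleting through a Base pointer needs direct access to the base destructor. So must I name the derived type explicitly? The answer is yes: otherwise the delete would use the base destructor, which is hidden as protected precisely so that it is not called from outside. This makes the approach rather pointless, though: if it is this inconvenient to use, why hide the destructor at all? What is the purpose?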

#include <iostream>
using namespace std;
class Base
{
protected:
        Base() = default;
        virtual ~Base()  = default;
public:
        Base(const Base& other) = delete;
};
class Derived : public Base
{
public:
        Derived(){cout << __PRETTY_FUNCTION__ << endl;}
        virtual ~Derived(){cout << __PRETTY_FUNCTION__ << endl;}
  //void operator delete(void* ptr){cout << __PRETTY_FUNCTION__ << endl;} // you do not need to overload it, but knowing the form of the overload is useful
};
int main()
{
        Base* ptr = new Derived{};
        delete dynamic_cast<Derived*>(ptr);
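  //delete ptr; // does not compile here because ~Base() is protected; if it were public, this would run ~Derived() and then ~Base() through the virtual destructor and finally call the global operator delete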
        return 0;
}

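March 23: waiting for change, waiting for opportunity

I mistakenly upgraded to Ubuntu 18.04 and ran into many baffling problems; for example, Eclipse for C++ stopped working. Here is the fix I found: install it with the installer instead of running the package directly.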
  1. sudo apt install -y eclipse-cdt-*
  2. 
    $ URL=https://www.eclipse.org/downloads/download.php
    $ ECLIPSE=/oomph/epp/oxygen/R/eclipse-inst-linux64.tar.gz
    $ MIRROR=1
    $ wget -q -O eclipse-inst-linux64.tar.gz \
      "${URL}?file=${ECLIPSE}&mirror_id=${MIRROR}"
    $ tar zxf eclipse-inst-linux64.tar.gz
    $ ./eclipse-installer/eclipse-inst
    $ sudo sed -i /usr/share/applications/eclipse.desktop \
      -e "s;^Exec=eclipse;Exec=${HOME}/eclipse/cpp-oxygen/eclipse/eclipse;g"
    
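  3. I found a new problem: Ubuntu 18.04's default Java 11 seems to have trouble running it, so I had to change the default JVM (the web says to use Java 9, which I could not install). So eclipse.ini is changed as follows: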
    
    --launcher.appendVmargs
    -vm
    #/usr/lib/jvm/java-11-openjdk-amd64/bin
    /usr/lib/jvm/java-8-openjdk-amd64/bin

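March 29: waiting for change, waiting for opportunity

This has to be done over and over, since every Ubuntu install needs configuring. For the input method I now choose fcitx. Roughly:
  1. First install the googlepinyin package: sudo apt install -y fcitx-googlepinyin
  2. Install Chinese in the region/language settings
  3. Configure googlepinyin in fcitx configure
Setting Chinese as the default locale goes like this:
  1. First generate the Chinese locale: sudo locale-gen --lang zh_CN.UTF-8
  2. Then set it as the default locale: sudo update-locale LANG=zh_CN.UTF-8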

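April 1: waiting for change, waiting for opportunity

CentOS is noticeably more sophisticated than Ubuntu, although its complexity and stability also limit how quickly it takes up the newest software versions. For the default grub2 boot entry, I could never find the so-called grub2-editenv, so in the end I edited the grubenv file directly, copying the boot entry string in as the replacement.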

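April 7: waiting for change, waiting for opportunity

My new laptop uses an AMD Ryzen 5 2500U, and it kept freezing under Ubuntu 18.04's standard 4.18 kernel. After a long search I found that these kernel parameters seem to solve it: idle=nomwait mce=off amdgpu.runpm=0 processor.max_cstate=1 intel_idle.max_cstate=0 pci=noacpi pci=noaer rcu_nocbs=0-7. I do not know which of them are actually required; perhaps idle=nomwait mce=off?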

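April 12: waiting for change, waiting for opportunity

A classmate sent me a problem that someone reportedly answered in 20 minutes. I was skeptical: it should not even take ten minutes, since it is a five-minute problem. My earlier idea about the size of RSA primes was wrong: those are primes of 700-odd digits, whereas a ten-digit number takes less than a second. Leaving prime factorization by program aside for the moment, the following problem can be solved in twenty minutes: how many times does the digit 3 appear among all odd numbers between 1 and 707829217? This is really a combinatorics problem. Apart from the first and last digits, each of the other 7 positions has ten possible choices, 0 to 9; a 0 in the leading position can simply be ignored and is still legal, e.g. 00123457, 07 and so on.
  1. So we can count position by position. First count the odd numbers with a 3 in the units place. The units digit is 3 and the leading digit has 8 choices, 0 to 7. This is exactly the key mistake: the whole number must not exceed 707829217, so the case where the leading digit is 7 is different and must be counted separately. The other 7 positions have 10 choices each, giving 8x10x10x10x10x10x10x10x1=8x10^7
  2. Now count the occurrences of 3 in the tens place. The units digit has 5 choices, namely 1,3,5,7,9; the tens digit has just 1; the leading digit still has 8 choices and each of the other 6 positions has 10, so the total is 8x10^6x1x5=4x10^7. All other positions except the leading one behave the same, so the total for the 7 middle positions is 7x4x10^7=28x10^7. The problem again is that when the leading digit is 7 the rest must not exceed 707829217, so this is wrong.
  3. A 3 in the leading position: the other 7 positions still have 10 choices each and the units digit has 5, so 10^7x5 in total. This conclusion is still correct.
  4. Summing up: 8x10^7+28x10^7+5x10^7=41x10^7
The lady's reply was that this is wrong. Meanwhile the dumbest possible program I wrote, checking number by number, ran all night with no sign of finishing, so apparently I got it wrong. Time to rethink.
Rethinking, it is still wrong, because the leading-zero case is mishandled: 003 and 03 are actually the same number and I counted them repeatedly, so I need to think again. This really does not look like a 20-minute problem; my utterly simple verification program ran for two days without producing a result. Shortcuts will not do, and even solving it by program requires optimizing the algorithm. Split the problem into three classes of subproblem: a) a 3 in the leading position, b) a 3 in the units place, c) a 3 in each position other than the leading and units ones:
It seemed the plain counting program was too slow and I had to find another way. But I was wrong yet again: I had been running it under Eclipse's debug view, which was far too slow. On the command line it did produce a result: 368247332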

#include <iostream>
using namespace std;
// Counts how many decimal digits of n are equal to 3.
int countThree(int n)
{
        int result=0;
        do
        {
                if (n % 10 == 3)
                {
                        result ++;
                }
                n = n/10;
        }
        while (n > 0);
        return result;
}
int main()
{
        int total = 0;
        for (int i = 1; i <= 707829217; i+= 2)
        {
                total += countThree(i);
                // per-number progress dot removed: roughly 354 million writes would likely dominate the run time
        }
        cout << endl << total << endl;
        return 0;
}
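Apparently my mental arithmetic is still poor, since I am very bad at estimating the amount of computation: I badly misjudged how long such a small program would take to run. As for the factorization, the result is 8171 * 86627.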

        for (int i = 2; i <=707829217; i++)
        {
                if (707829217%i == 0)
                {
                        cout << i << " * " << 707829217/i << endl;
                        break;
                }
        }

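April 19: waiting for change, waiting for opportunity

There is no limit to how lazy a person can get. Yesterday I suddenly wanted to try Linux Mint, because my colleague K said he and his wife both use it and that it is supposedly even more foolproof than Ubuntu. I do not want to replace my current Ubuntu 18.04; after reading comments on YouTube I just wanted to try it. That is where the trouble began. For a mere trial the options are USB or live CD. For the former I tried a method I had downloaded long ago, and both the persistent-USB and the full-installation attempts failed; I suspected a driver problem or something similar. So I turned to the live CD route, and only then realized that what I had downloaded might itself be flawed. In any case, I discovered unforgivable mistakes in the grub.cfg I wrote long ago. For instance, what does search --no-floppy --fs-uuid --set=root 66e7ef64-61a6-4642-a98d-2dd23c14103a actually do, and what does --set=root really mean? Laughably, I then went on to write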
      set isofile="/media/debian-live-9.5.0-amd64-gnome.iso"
        set isolabel="d-live 9.5.0 gn amd64"
        search --no-floppy --fs-uuid --set 3e00216e-8aa8-4080-a622-3f98ae5921d0
        loopback loop (hd0,5)$isofile
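Note the (hd0,5). Using it was the big problem in the first place, because it relies on the SCSI device order staying the same on every boot, which is exactly why I switched to UUIDs. Searching by UUID and setting root already removes the need for device numbers. I also found that the partition naming has changed: I now seem to be using GPT, so what used to be (hd0,1) is now (hd0,gpt1). I guess I used msdos partitions before and GPT now? So the correct approach is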
     set isofile="/home/nick/Downloads/linuxmint-17.1-cinnamon-oem-64bit.iso"                
        search --no-floppy --fs-uuid --set=root 66e7ef64-61a6-4642-a98d-2dd23c14103a    
        loopback loop $isofile
        linux (loop)/casper/vmlinuz file=(loop)/preseed/linuxmint.seed boot=casper iso-scan/filename=$isofile
        initrd (loop)/casper/initrd.lz
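So in the subsequent loopback loop statement I simply use the file path on this partition, with no device number needed. The remaining problem is that the casper boot that follows still does not succeed. As the founding master said: read the fucking code!
So I unpacked initrd.lz to have a look: lzma -dc initrd.lz | cpio -iv. While reading the init script I came across something that could not be simpler. I had always seen other people write this kind of thing: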
 # Parse command line options
for x in $(cat /proc/cmdline); do
        case $x in
        init=*)
                init=${x#init=}
                ;;
        root=*)
                ROOT=${x#root=}
                case $ROOT in
                LABEL=*)
                        ROOT="${ROOT#LABEL=}"
                        # support any / in LABEL= path (escape to \x2f)
                        case "${ROOT}" in
                        */*)
                        if command -v sed >/dev/null 2>&1; then
                                ROOT="$(echo ${ROOT} | sed 's,/,\\x2f,g')"
                        else
                                if [ "${ROOT}" != "${ROOT#/}" ]; then
                                        ROOT="\x2f${ROOT#/}"
                                fi
                                if [ "${ROOT}" != "${ROOT%/}" ]; then
                                        ROOT="${ROOT%/}\x2f"
                                fi
                                IFS='/'
                                newroot=
                                for s in $ROOT; do
                                        newroot="${newroot:+${newroot}\\x2f}${s}"
                                done
                                unset IFS
                                ROOT="${newroot}"
                        fi
                        esac
                        ROOT="/dev/disk/by-label/${ROOT}"
                        ;;
                UUID=*)
                        ROOT="/dev/disk/by-uuid/${ROOT#UUID=}"
                        ;;
                /dev/nfs)
                        [ -z "${BOOT}" ] && BOOT=nfs
                        ;;
                esac
                ;;
        rootflags=*)
                ROOTFLAGS="-o ${x#rootflags=}"
                ;;
        rootfstype=*)
                ROOTFSTYPE="${x#rootfstype=}"
                ;;
        rootdelay=*)
                ROOTDELAY="${x#rootdelay=}"
                case ${ROOTDELAY} in
                *[![:digit:].]*)
                        ROOTDELAY=
                        ;;
                esac
                ;;
        resumedelay=*)
                RESUMEDELAY="${x#resumedelay=}"
                ;;
        loop=*)
                LOOP="${x#loop=}"
                ;;
        loopflags=*)
                LOOPFLAGS="-o ${x#loopflags=}"
                ;;
        loopfstype=*)
                LOOPFSTYPE="${x#loopfstype=}"
                ;;
        cryptopts=*)
                cryptopts="${x#cryptopts=}"
                ;;
        nfsroot=*)
                NFSROOT="${x#nfsroot=}"
                ;;
        netboot=*)
                NETBOOT="${x#netboot=}"
                ;;
        ip=*)
                IP="${x#ip=}"
                ;;
        boot=*)
                BOOT=${x#boot=}
                ;;
        ubi.mtd=*)
                UBIMTD=${x#ubi.mtd=}
                ;;
        resume=*)
                RESUME="${x#resume=}"
                ;;
        resume_offset=*)
                resume_offset="${x#resume_offset=}"
                ;;
        noresume)
                noresume=y
                ;;
        panic=*)
                panic="${x#panic=}"
                case ${panic} in
                *[![:digit:].]*)
                        panic=
                        ;;
                esac
                ;;
        quiet)
                quiet=y
                ;;
        ro)
                readonly=y
                ;;
        rw)
                readonly=n
                ;;
        debug)
                debug=y
                quiet=n
                exec >/run/initramfs/initramfs.debug 2>&1
                set -x
                ;;
        debug=*)
                debug=y
                quiet=n
                set -x
                ;;
        break=*)
                break=${x#break=}
                ;;
        break)
                break=premount
                ;;
        blacklist=*)
                blacklist=${x#blacklist=}
                ;;
        netconsole=*)
                netconsole=${x#netconsole=}
                ;;
        BOOTIF=*)
                BOOTIF=${x#BOOTIF=}
                ;;
        hwaddr=*)
                BOOTIF=${x#BOOTIF=}
                ;;
        recovery)
                recovery=y
                ;;
        esac
done
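How much information is in there?
Of all things, my questions start here: init=${x#init=}. What does x#init= mean? Funny, right? Google's answer is this: # removes the shortest matching prefix and ## removes the longest matching prefix. Here is an example: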
nick@nick-AMD-Laptop:/tmp$ x=/a/b/c/d
nick@nick-AMD-Laptop:/tmp$ echo ${x##*/}
d
nick@nick-AMD-Laptop:/tmp$ echo ${x#*/}
a/b/c/d
nick@nick-AMD-Laptop:/tmp$ basename $x
d
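Next, we see that passing boot=casper to the generic init script makes it call /scripts/casper, and certain parameters are passed straight through to it. So what do we find on opening casper? I now understand a few things, for example that the cdrom is mounted under the /cdrom directory, which explains what I saw in grub.cfg.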

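April 22: waiting for change, waiting for opportunity

Who receives the parameters given in grub? It turns out they go to casper. And what is their purpose? I think I had compiled grub2 once before; this time I found another interesting problem. The cause was that I had not installed pkg-config, yet the configure error suggested that the autoconf output needed regenerating. In short, the error was very misleading.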

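April 30: waiting for change, waiting for opportunity

I got up this morning feeling good, and during my walk I changed my id to "不畏浮云遮望眼,只缘身在此山巅" (roughly: unafraid of drifting clouds blocking my view, for I stand on the summit of this mountain). Then I read about drivers for half an hour. This is really just part of the basic skill set; the internal interfaces of the hardware are something only the hardware company's own teams can know. To reinforce my memory, a summary of the basic principles I learned: Here is a set of funny videos for a laugh, apparently all from SUSE. Laziness takes many forms. For example, my Amazon upload program is ancient and needs the libcurl runtime, but after the upgrade to Ubuntu 18.04 the runtime was no longer compatible, so I downgraded it, with the direct result that the regular curl could no longer be installed. For such a simple problem I should by rights just have recompiled my program, but I could not be bothered, so I decided to recompile curl instead, which is really quite unreasonable.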

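May 3: waiting for change, waiting for opportunity

I made a fool of myself yesterday. QA needed to upgrade the BIOS/FW, so how to boot? From my legacy-mode experience the obvious answer was a bootable USB stick, which is a real hassle: on Windows you have to get hold of a WinPE, which is a pain; on Linux you have to download the image of some popular distro, and on the office laptop I had already deleted the VMware Ubuntu and the like for lack of disk space. Then W mentioned the UEFI option. I had never actually booted via UEFI. In UEFI boot mode, the BBS submenu lets you boot straight into the UEFI shell, which is the easiest solution. This is common knowledge that I lacked because I have always resisted UEFI, which is really laughable. The new office has many problems. First, the company's new firewall policy broke access to subversion. K offered a way around it using ssh remote port forwarding: connections initiated by putty from our laptops are allowed through the firewall, so the remote server has to loop back through that. Second, the NTP server changed, and a colleague's VM keeps altering its own time. Even with ntpd set up correctly the time still gets tampered with; ntpd -qg fixes it manually, but afterwards it is changed back again, which is very strange. I also need to set up an OpenLDAP server, which is very complicated; I read for ages and still do not quite understand how to configure it. Anything security-related seems to be on the complicated side.
An old story: my Ubuntu would not boot after a kernel upgrade, so I needed to change the grub menu to make the old kernel the default entry. Why does changing GRUB_DEFAULT not work? Others have asked this and I have hit it too. The mistake everyone makes is to assume it is a grub.cfg parameter, when in fact it belongs in /etc/default/grub, so you edit it there and regenerate with update-grub. The hard part is selecting a submenu entry: should it be 2>2 or 1>2 and so on? I cannot tell whether grub's default menu, by placing recovery mode inside the advanced submenu, changes the numbering, i.e. whether "Ubuntu advanced" is now the second top-level entry so that its submenu indices change as well. In the end I had to use the menu title: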

May 7: waiting for change, waiting for opportunity

The HOMM3 multiplayer/wine issue goes like this; I can never remember the name of this file:
Note on the HD mod: The HD mod puts its own dpwsockx.dll file into the Heroes3 folder, which overrides the system one. This causes hangs when trying to host. Once this file is removed or replaced with the one from the DirectX redistributable, the HD mod works fine online too.

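May 21: waiting for change, waiting for opportunity

There has been a lot of turbulence lately. I saw in the news that Xi Jinping made a special trip to Yudu in Jiangxi, the starting point of the Red Army's Long March, to lay a wreath. I imagine his state of mind at this moment must be very complex; for the country and for the individual alike, a key strategic decision has to be made. A couple of days ago I hit a simple problem: a freshly installed CentOS had no network after boot. I later realized that the static IP I had configured was not set to come up at boot; that is, the interface's ifcfg file under network-scripts needs ONBOOT=yes, and the file name must of course contain the name of the network card. As for how I could access the machine without a network: through the BMC's remote console, of course.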

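May 27: waiting for change, waiting for opportunity

This came up at work; the explanation of the IPv6 link-local address issue is very clear. The IPMI spec can actually be downloaded directly, it is just that Intel's version never lets you copy from it. This is confidential material about Optane; if it leaks, that is purely a matter of Amazon S3 permissions and has nothing to do with me.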

June 4: waiting for change, waiting for opportunity

Once more, how to make a DVD:

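June 11: waiting for change, waiting for opportunity

At work I had to write the simplest possible dynamic-loading client to call a function in a shared library. I thought I knew this inside out, yet it still took some thought. The beginner's tutorial is of course thoroughly familiar, but one obvious question is how dlopen resolves the other libraries that the loaded library itself depends on. By dependencies I do not mean shared libraries in the system directories, but other libraries shipped alongside the one being loaded. The obvious solution is to set the LD_LIBRARY_PATH environment variable before the call. As for the dlopen flags, no special consideration is needed here. In the past, faced with multiple dependencies, I used RTLD_NOW|RTLD_DEEPBIND, a combination that amounts to a dynamic implementation of static linking and throws away most of the advantages of dynamic linking. Back then I had an unusual situation: several different versions of a shared library had to be loaded at the same time, so symbols with the same name coexisted in multiple dlopen'ed libraries. That is a programmer's nightmare in any case, and debugging it through ld.so was extremely painful. Today's problem is incomparably simpler.
For simplicity I declared the function to be loaded as a global function rather than a member function of some class, which greatly simplifies the dlopen work. Otherwise, have you thought about how to load a class dynamically? The standard answer is a class factory, but what if lazy people gave you no factory and you still want to call the class's member functions? That is painful. I once wanted to reverse-engineer a library to test one of its functions, reverse-engineering meaning guessing the calls from objdump without header files. To begin with I could only use the class's constructor. Is that a problem? Think about it and you will recall that class member functions, constructors included, need a this pointer to a block of memory as the implicit first argument. How do I call a constructor dynamically? Without the headers, how do I know the size of an instance? In the end I just guessed, handing the constructor a block of memory that was surely big enough; it was only a test, so the bigger the better. Member variables become offsets at compile time, while member functions are not really different from ordinary functions in a namespace, apart from the implicit this pointer, so you can call them.
My current problem is much simpler: I only need to wrap my global function in a namespace. Here I made an elementary mistake. Do you think that, having declared a function in some namespace, writing using namespace xxx; in the implementation file is enough? That merely uses the namespace; the function's definition itself must be enclosed in the namespace. This seems to be a common mistake. As for symbol mangling, I see no real solution: it is decided by the compiler and may change. Perhaps a C++ mangling library could generate the name, but that is too much trouble, so I simply hardcoded the symbol I saw in objdump.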

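June 13: waiting for change, waiting for opportunity

Continuing from last time: I made a basic mistake about loading the shared library, namely that I still need LD_LIBRARY_PATH. The library I load depends on other libraries that the dynamic loader must be able to find, and it cannot find them in directories outside the system search path. So why not set the path myself just before calling dlopen? Simple in theory, but it does not work, because this environment variable is read only once, when the process starts, so changing it at run time is too late. Then let me run myself again. At first I made it very complicated, with fork and even pipe, but then it occurred to me that all I need is exec.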
  1. To install wine I first need to update the freshly installed Ubuntu 18.04.
  2. sudo dpkg --add-architecture i386
  3. wget -qO - https://dl.winehq.org/wine-builds/winehq.key | sudo apt-key add -
  4. sudo apt-add-repository 'deb https://dl.winehq.org/wine-builds/ubuntu/ bionic main'
  5. sudo apt-get update
  6. sudo apt-get install --install-recommends winehq-stable
I want the top bar to show hours, minutes and seconds, which requires installing gnome-tweaks.

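June 14: waiting for change, waiting for opportunity

The steps for installing HOMM3 are as follows:
At work I hit a strange problem. I wanted to do some processing on an exception and then throw it again, but to my surprise the following happened: the object I passed in was actually of the child class, yet at the throw it was treated as the parent class. I had expected the catch to see the child class, since that is the object's real type. The result, however, is that whatever static type you throw is the type that gets thrown. Reading up on C++ I learned that throw behaves like a constructor call, specifically a copy construction from the static type of the expression, so the child part is sliced off.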

#include <iostream>
#include <exception>
#include <string>
using namespace std;
class ParentEx: exception
{
public:
	string m_name;
	ParentEx()
	{
		m_name = "parent";
	}
};
class ChildEx : public ParentEx
{
public:
	string m_strWhat;
	ChildEx()
	{
		m_name ="child";
	}
};
void addMore(ParentEx& ex)
{
	ex.m_name += "added";
	throw ex;
}
int main()
{
	try
	{
		ChildEx ex;
		addMore(ex);
	}
	catch (ParentEx& parent)
	{
		cout << "parent name:"<< parent.m_name << endl;
	}
	catch (ChildEx& child)
	{
		cout << "child name:" << child.m_name << endl;
	}
	return 0;
}
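The output is: parent name:childadded
The core point is that my understanding of how throw really works was vague, and it still is. What I now grasp is the basics: how the exception object is stored is unspecified, meaning the user may not and need not know about it, and the object is initialized by copy, or by the move introduced in C++11. Also, a bare throw; is the counterpart of rethrowing in Java: with no operand it rethrows the current exception, keeping its original dynamic type.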
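I came across a simple btrfs command for finding changed files: sudo btrfs subvol find-new ${path} ${gen_id} | sed '$d' | cut -f17- -d' ' | sort | uniq, where the first argument is the path of the new snapshot's subvolume and the second is the generation id of the old snapshot. How do you get a snapshot's generation id? Again with find-new, just pass a very large number as the gen id. As I understand it, restoring a snapshot works by using set-default to choose the filesystem root for the next boot. For the subvolume mechanism I am grateful that I had earlier exposure to explanations of NAS snapshot mechanisms, so mount points are easy for me to understand: thanks to the copy-on-write implementation underneath, multiple filesystems based on different inodes can be created. Just remember that this snapshot is writable. I wonder whether I could chroot into it myself instead of rebooting?
About wireshark, this is the same old story: at install time you enable ordinary users to run dumpcap, but the mechanism works by creating a wireshark group, so I need to add myself to it: sudo usermod -a -G wireshark $USER. Then you need to log out: gnome-session-quit --logout. I also ran into another problem: I had earlier switched my locale from Chinese back to English, and after that Google Pinyin stopped working. I am not sure what is going on, but it seems that after setting Chinese up again I have to start fcitx once more. Perhaps a better option is to install Sogou Pinyin.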

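June 27: waiting for change, waiting for opportunity

I have always been somewhat intimidated by boost, as if the STL were enough. But in practice I feel even less at home with C++11 and the like, so last night I decided to set aside time every day to study it; work calls for it from time to time anyway. My learning is mostly a skim to get familiar, so I only need to cover a few aspects:
  1. Which header files to include
  2. Which namespace it lives in
  3. Which library needs to be linked
  4. What the main classes are, how to create them, and how to print them for verification
  5. What the main methods are, how to apply them, what the underlying algorithms are, and so on
So let us start with the simplest one, the filesystem library:
  1. You only need to include <boost/filesystem.hpp>. This is very convenient because this header pulls in all the other headers you need; this kind of portal design is fitting.
  2. using namespace boost::filesystem; is convenient, but some of its names clash with std, so it is better not to also write using namespace std;.
  3. The most basic class is path. It can be seen as a wrapper around a string and can be printed directly as one. It is used by the two other main classes: file_status, which wraps the various attributes of a file, and directory_entry, which holds a path together with its file_status and is the element you get when iterating over a directory. Most operations revolve around the latter. For example, directory_iterator takes a path and walks all the files in it; the end of the iteration is a default-constructed iterator, for example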
    
            for (directory_iterator it(current_path()); it != directory_iterator(); it ++)
            {
                    std::cout << *it << std::endl;
            }
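  4. directory_iterator is a flat iterator. The other one, recursive_directory_iterator, is even more useful: it is essentially a depth-first traversal. It has depth and level, whose exact meaning I do not know yet, though I guess they relate to the traversal depth, and its constructor also seems to accept further options. A traversal that in WinAPI required a fairly complex callback-style program thus becomes a simple loop. There are also many convenience functions, such as those for testing the file type.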

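July 10: waiting for change, waiting for opportunity

My second small goal is regex. On one hand it is quite self-contained and I use it regularly at work; on the other hand it is very useful and I always have many doubts about it. I dawdled and progress was slow, though not without reason: I first had to refresh my regex knowledge. I once spent a few days on Perl, but my grasp of regex is really shallow; after a few hours of learning on YouTube I still made plenty of mistakes in the boost code. Roughly:
  1. The header is boost/regex.hpp, which includes nearly all the headers you need, such as regex_match.hpp, regex_split.hpp and so on
  2. The namespace is simple, just boost, and you need to link the boost_regex library
  3. A digression on how to build libboost. Say I want to build the static regex library to debug why an exception is thrown; the following is a template: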
    1. ./bootstrap.sh --prefix=<where you want to install libboost> --with-libraries=regex
    2. ./b2 install link=static threading=single variant=debug
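  4. There is actually quite a lot to the usage. The simplest is regex_split, which you understand almost immediately. If you just use the default delimiter, you only need a back_inserter to collect the results: vector<string> vect; regex_split(back_inserter(vect),str);
  5. regex_search is also fairly easy to understand. The key is to define your regex well. For instance, to find the words one by one, regex ex("\\w+"); lets us implement something like regex_split. The trick is to use the all-important match_results: for string it is smatch, for char* it is cmatch. match_results is a rather complex class wrapping the results and iterators. So far I have found that result[0] can be used directly as the full match, while result[0].second is actually an iterator, and result[0].second.operator->() returns a pointer to where the current search stopped.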
    void searchTest()
    {
            const string str = "this is a sentence of many words.";
            regex ex("\\w+");
            smatch what;
            string::const_iterator start = str.begin(), end = str.end();
            while (regex_search(start, end, what, ex))
            {
                    cout << what[0] << endl;
                    cout << "second:" << what[0].second.operator ->() << endl;
                    start = what[0].second;
            }
    }
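  6. I seem to have persistently misunderstood regex_match. For example, I ran into this error thrown by boost: The complexity of matching the regular expression exceeded predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous. This exception is thrown to prevent "eternal" matches that take an indefinite period time to locate. The culprit was this definition: regex ex("(\\w+\\s?)+"); My guess, following what I found on Google, is that having a + both inside and outside the parentheses is the problem. The following, regex ex("((\\w|\\s)+)\\."); also matches but is really very permissive.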
    
    void matchTest()
    {
            const string str = "this is a sentence of many words.";
            //regex ex("((\\w|\\s)+)\\.");
            regex ex("((\\w+\\s?)+)\\.");
            smatch what;
            if (regex_match(str, what, ex))
            {
                    cout << what[0] << endl;
            }
    }
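One way to remove the ambiguity mentioned in item 6 is to require exactly one separator between words, so that a run of word characters cannot be divided up in more than one way. The sketch below assumes std::regex (C++11) as a stand-in for boost::regex, and the pattern is slightly stricter than the original (it does not allow a space before the final period). The names isSentence and splitWords are invented for illustration; whether std::regex would report the same complexity error on the original pattern is not something this sketch claims.

```cpp
#include <regex>
#include <string>
#include <vector>

// Matches "word word ... word." with exactly one whitespace character
// between words. After a run of \w+ the next character must be whitespace
// or the final '.', so there is only one way to divide the input.
bool isSentence(const std::string& text)
{
    static const std::regex pattern("\\w+(?:\\s\\w+)*\\.");
    return std::regex_match(text, pattern);
}

// Splits text into words by iterating over every \w+ match, which is the
// same idea as the regex_search loop in searchTest().
std::vector<std::string> splitWords(const std::string& text)
{
    static const std::regex word("\\w+");
    std::vector<std::string> words;
    for (std::sregex_iterator it(text.begin(), text.end(), word), end; it != end; ++it)
        words.push_back(it->str());
    return words;
}
```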

July 11: waiting for change, waiting for opportunity

I can never quite get my head around boost's bind; it is genuinely difficult.

#include <iostream>
#include <boost/bind.hpp>
#include <boost/function.hpp>
using namespace std;
using namespace boost;
void asyncRead(boost::function<void(int, int)> handler)
{
	handler(3, 4);
}
class Client
{
public:
	Client(){;}
	void Start()
	{
		mem_fun(&Client::Foo)(this);
		bind1st(mem_fun(&Client::MyHandler), this)(78);
		boost::function<void (int)> fun1(bind(&Client::MyHandler, this, _1));
		fun1(100);
		bind(&Client::MyHandler, this, 24)();
		bind(&Client::MyHandler, this, _1)(29);
		boost::function<void(int, int)> fun2(bind(&Client::MyHandler2, this, _1, _2));
		asyncRead(fun2);
		asyncRead(bind(&Client::MyHandler2, this, _1, _2));
	}
	void MyHandler(int errCode)
	{
		cout << "errCode:" << errCode << endl;
	}
	void MyHandler2(int err, int number)
	{
		cout << "***err:" << err << endl
				<< "number:" << number << "***"<< endl;
	}
	void Foo()
	{
		cout << "Foo" << endl;
	}
};
int main()
{
	Client client;
	client.Start();
	return 0;
}
Here is the output; it should make clear how the program behaves.
Foo
errCode:78
errCode:100
errCode:24
errCode:29
***err:3
number:4***
***err:3
number:4***
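For comparison, the same member-function binding can be written with the standard library. This is a sketch that assumes C++11's std::bind and std::function instead of the Boost versions, plus a lambda that does the same job; Recorder, onRead and start are hypothetical names, and asyncRead simply invokes the handler immediately, as in the program above.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Stand-in for an asynchronous read: calls the handler right away with (3, 4).
void asyncRead(const std::function<void(int, int)>& handler)
{
    handler(3, 4);
}

// Hypothetical class that records every (err, number) pair it is called with.
struct Recorder
{
    std::vector<std::pair<int, int>> calls;

    void onRead(int err, int number)
    {
        calls.emplace_back(err, number);
    }

    void start()
    {
        using namespace std::placeholders;
        // Bind the member function to this object; _1 and _2 are filled in
        // by whoever finally calls the handler.
        asyncRead(std::bind(&Recorder::onRead, this, _1, _2));
        // The equivalent lambda, usually easier to read.
        asyncRead([this](int err, int number) { onRead(err, number); });
    }
};
```

Both calls reach onRead with the arguments that asyncRead supplies, which is exactly what the two asyncRead lines in Client::Start do.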

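July 14: waiting for change, waiting for opportunity

asio is actually quite complex, and I keep running into trouble with it at work, so I am now learning it from scratch. This is probably the best-known introductory example, and it contains the main elements. Our code at work is an extension of this example, but I keep hitting a hang in async_read; I hope this simple example will make debugging easier, because asynchronous problems are really rather complicated.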
#include <iostream>
#include <istream>
#include <ostream>
#include <string>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
using boost::asio::ip::tcp;
class client
{
public:
	client(boost::asio::io_service& io_service, const std::string& host,
			const std::string& port,
			const std::string& path) :
			resolver_(io_service), socket_(io_service)
	{
		// Form the request. We specify the "Connection: close" header so that the
		// server will close the socket after transmitting the response. This will
		// allow us to treat all data up until the EOF as the content.
		std::ostream request_stream(&request_);
		request_stream << "GET " << path << " HTTP/1.0\r\n";
		request_stream << "Host: " << host << "\r\n";
		request_stream << "Accept: */*\r\n";
		request_stream << "Connection: close\r\n\r\n";
		// Start an asynchronous resolve to translate the server and service names
		// into a list of endpoints.
		tcp::resolver::query query(host, port);
		resolver_.async_resolve(query,
				boost::bind(&client::handle_resolve, this,
						boost::asio::placeholders::error,
						boost::asio::placeholders::iterator));
	}
private:
	void handle_resolve(const boost::system::error_code& err,
			tcp::resolver::iterator endpoint_iterator)
	{
		if (!err)
		{
			// Attempt a connection to the first endpoint in the list. Each endpoint
			// will be tried until we successfully establish a connection.
			tcp::endpoint endpoint = *endpoint_iterator;
			socket_.async_connect(endpoint,
					boost::bind(&client::handle_connect, this,
							boost::asio::placeholders::error,
							++endpoint_iterator));
		}
		else
		{
			std::cout << "Error: " << err.message() << "\n";
		}
	}
	void handle_connect(const boost::system::error_code& err,
			tcp::resolver::iterator endpoint_iterator)
	{
		if (!err)
		{
			// The connection was successful. Send the request.
			boost::asio::async_write(socket_, request_,
					boost::bind(&client::handle_write_request, this,
							boost::asio::placeholders::error));
		}
		else if (endpoint_iterator != tcp::resolver::iterator())
		{
			// The connection failed. Try the next endpoint in the list.
			socket_.close();
			tcp::endpoint endpoint = *endpoint_iterator;
			socket_.async_connect(endpoint,
					boost::bind(&client::handle_connect, this,
							boost::asio::placeholders::error,
							++endpoint_iterator));
		}
		else
		{
			std::cout << "Error: " << err.message() << "\n";
		}
	}
	void handle_write_request(const boost::system::error_code& err)
	{
		if (!err)
		{
			// Read the response status line.
			boost::asio::async_read_until(socket_, response_, "\r\n",
					boost::bind(&client::handle_read_status_line, this,
							boost::asio::placeholders::error));
		}
		else
		{
			std::cout << "Error: " << err.message() << "\n";
		}
	}
	void handle_read_status_line(const boost::system::error_code& err)
	{
		if (!err)
		{
			// Check that response is OK.
			std::istream response_stream(&response_);
			std::string http_version;
			response_stream >> http_version;
			unsigned int status_code;
			response_stream >> status_code;
			std::string status_message;
			std::getline(response_stream, status_message);
			if (!response_stream || http_version.substr(0, 5) != "HTTP/")
			{
				std::cout << "Invalid response\n";
				return;
			}
			if (status_code != 200)
			{
				std::cout << "Response returned with status code ";
				std::cout << status_code << "\n";
				return;
			}
			// Read the response headers, which are terminated by a blank line.
			boost::asio::async_read_until(socket_, response_, "\r\n\r\n",
					boost::bind(&client::handle_read_headers, this,
							boost::asio::placeholders::error));
		}
		else
		{
			std::cout << "Error: " << err << "\n";
		}
	}
	void handle_read_headers(const boost::system::error_code& err)
	{
		if (!err)
		{
			// Process the response headers.
			std::istream response_stream(&response_);
			std::string header;
			while (std::getline(response_stream, header) && header != "\r")
				std::cout << header << "\n";
			std::cout << "\n";
			// Write whatever content we already have to output.
			if (response_.size() > 0)
				std::cout << &response_;
			// Start reading remaining data until EOF.
			boost::asio::async_read(socket_, response_,
					boost::asio::transfer_at_least(1),
					boost::bind(&client::handle_read_content, this,
							boost::asio::placeholders::error,
							boost::asio::placeholders::bytes_transferred));
		}
		else
		{
			std::cout << "Error: " << err << "\n";
		}
	}
	void handle_read_content(const boost::system::error_code& err, std::size_t byte_read)
	{
		if (!err)
		{
			// Write all of the data that has been read so far.
			std::cout << &response_;
			// Continue reading remaining data until EOF.
			boost::asio::async_read(socket_, response_,
					boost::asio::transfer_at_least(1),
					boost::bind(&client::handle_read_content, this,
							boost::asio::placeholders::error,
							boost::asio::placeholders::bytes_transferred));
		}
		else if (err != boost::asio::error::eof)
		{
			std::cout << "Error: " << err << "\n";
		}
	}
	tcp::resolver resolver_;
	tcp::socket socket_;
	boost::asio::streambuf request_;
	boost::asio::streambuf response_;
};
int main(int argc, char* argv[])
{
	try
	{
		if (argc != 4)
		{
			std::cout << "Usage: async_client <server> <port> <path>\n";
			std::cout << "Example:\n";
			std::cout << "  async_client www.boost.org 8080 /LICENSE_1_0.txt\n";
			return 1;
		}
		boost::asio::io_service io_service;
		client c(io_service, argv[1], argv[2], argv[3]);
		io_service.run();
	}
	catch (std::exception& e)
	{
		std::cout << "Exception: " << e.what() << "\n";
	}
	return 0;
}

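July 17: waiting for change, waiting for opportunity

Yesterday I made a fool of myself again. I had intuitively assumed that the endpoint_iterator used with async_connect referred to local interfaces. It does not: which local interface is used for a connection seems to be decided entirely by the operating system, or rather by the metric values in the routing table. So what does this iterator refer to? It walks the list of remote endpoints that the resolver returned for the host and port, and each one is tried in turn until a connection succeeds.
Another wrong impression concerned async_read_until. I had completely forgotten that this is a network receive: unlike a file stream, you cannot read byte by byte, or peek and push data back. So when it returns, the buffer almost certainly holds extra data beyond the delimiter. That is obvious once someone points it out.
Much of the mystery lies in boost's scheduler.ipp, and I still have not figured out how the event loop works. Is what I post a handler or a work item? My current understanding is that every async_xxx function posts a work item, and the work item's handler is queued along with it; io_service's run then loops, executing tasks and their handlers, and threads are presumably woken with something like pthread_cond_xxx. Yet at work, code very similar to the sample always seems to end up with one task waiting forever somewhere. Boost's code is very complex and hard to follow.
There is another basic fact I keep forgetting, about special characters in HTTP headers. In principle only the first : separates the header name from its value, and everything after it up to the \r\n is the value. The end of the header section can only be recognized by a blank line (\r\n\r\n). The status line before it ends with a single \r\n, and we can only expect the form HTTP/1.X statuscode message\r\n. This seems too basic, yet I have always been fuzzy about it. For example, when I used to pass a password in a header, what if the password itself contained special characters? By this rule it does not matter, because : is not restricted inside the value; only \r\n is.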
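The "split at the first colon only" rule can be sketched as below. This is a minimal sketch, not a complete HTTP parser: parseHeaderLine is an invented helper, X-Password in the usage note is a made-up header name, and the input is assumed to be one line as returned by std::getline (so it may still end in '\r').

```cpp
#include <string>

// Splits one header line into name and value at the FIRST ':' only.
// Returns false if the line contains no ':' (for example the blank line
// that terminates the header section).
bool parseHeaderLine(std::string line, std::string& name, std::string& value)
{
    // std::getline strips '\n' but leaves the '\r' of "\r\n" in place.
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos)
        return false;
    name = line.substr(0, colon);
    // Skip optional whitespace between ':' and the value.
    const std::string::size_type start = line.find_first_not_of(" \t", colon + 1);
    value = (start == std::string::npos) ? std::string() : line.substr(start);
    return true;
}
```

With the input "X-Password: a:b:c\r" the name is X-Password and the value is a:b:c, so colons inside the value survive untouched.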

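July 18: waiting for change, waiting for opportunity

I finally found the root of the problem: my misunderstanding of async_read_until. It looks perfectly clear, reading stops at the delimiter you specify, but the nature of network transfer means that by the time the condition is met, the data received has almost certainly run past that boundary. Normally this is harmless, because everything sits in the streambuf anyway. But we do something contradictory: our HTTP request asks for keep-alive. Usually you send Connection: close, so the server closes the socket once it has sent everything and the receiver can simply exit on EOF. With keep-alive, how do you know when to stop? Apart from waiting for the server to drop the connection, the only way is to use Content-Length to decide when the data is complete. In that case the completion condition passed to async_read is boost::asio::transfer_exactly(length), and this length must have the part already read into the streambuf subtracted from it. Otherwise the read keeps waiting for bytes the server will never send, and it is no better than transfer_at_least(1).
Another mistake, which really is inexcusable: exceptions follow the basic rules of classes, that is, a derived class is compatible with its base class, so the order of the catch clauses is decisive. Once I began to doubt the throw mechanism, I began to doubt this compatibility as well.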
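The subtraction described above can be sketched as plain arithmetic on the bytes received so far. This is an illustration under stated assumptions, not asio code: bodyBytesStillNeeded is an invented helper, and the buffer is modeled as a std::string holding the status line, the headers and whatever part of the body arrived with them. With asio the corresponding quantity would roughly be the Content-Length minus response_.size() once the headers have been consumed from the streambuf.

```cpp
#include <cstddef>
#include <string>

// buffered: everything received so far (status line, headers, and whatever
// part of the body arrived in the same reads).
// Returns how many body bytes still have to be read from the socket, i.e.
// the value one would hand to a transfer_exactly-style completion condition.
std::size_t bodyBytesStillNeeded(const std::string& buffered, std::size_t contentLength)
{
    const std::string::size_type headerEnd = buffered.find("\r\n\r\n");
    if (headerEnd == std::string::npos)
        return contentLength;  // headers incomplete, so no body byte is buffered yet
    const std::size_t bodyBuffered = buffered.size() - (headerEnd + 4);
    return bodyBuffered >= contentLength ? 0 : contentLength - bodyBuffered;
}
```

If the result is 0 the whole body is already in the buffer and no further read should be started at all.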

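July 21: waiting for change, waiting for opportunity

I found a very good post discussing how exceptions are implemented, which saves me from digging through the gcc source. I do not currently have the ability to locate the implementation in the source quickly anyway, and all I need is an understanding of the principle, so why bother? The setjmp-style implementation, though, will still take a lot of time to study. I once saw something similar in poco; strictly speaking it is not similar, since it converts a signal into an exception, which is actually harder because you have to get back to the stack where the signal occurred and raise the exception from there.

July 23: waiting for change, waiting for opportunity

This is my exercise in using boost to unzip zip files. To stress it once more: use path and directory_iterator to walk the files, and use the so-called operations such as is_regular to test file attributes, while the file name, extension and so on are properties of path; this takes some getting used to. In addition, iequals/ifind and friends in algorithm are case-insensitive algorithms. The header boost/algorithm/string.hpp contains many useful algorithms worth learning.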

#include <cstdlib>   // system()
#include <iostream>
#include <boost/filesystem.hpp>
#include <boost/algorithm/string.hpp>
using namespace std;
using namespace boost::filesystem;
using namespace boost::algorithm;
int main(int argc, char** argv)
{
        if (argc != 2)
        {
                cout << "usage: " << argv[0] << " <path>" << endl; 
                return -1;
        }
        path p(argv[1]);
        for (directory_iterator it = directory_iterator(p); it != directory_iterator(); ++it)
        {
                // is_regular_file() is the current name of the deprecated is_regular()
                if (is_regular_file(it->path()))
                {
                        if (it->path().has_extension() && iequals(it->path().extension().string(), ".zip"))
                        {
                                string strCmd = "unzip -o ";
                                strCmd += it->path().string();
                                strCmd += " -d /home/nick/Downloads/game";
                                system(strCmd.c_str());
                        }
                }
        }
        return 0;
}
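The program above pastes the file path into the shell command unquoted, so an archive name containing a space would break it. The sketch below shows one way to guard against that, together with the case-insensitive extension test. It assumes C++17's std::filesystem instead of boost::filesystem and a POSIX shell; isZipFile, shellQuote and buildUnzipCommand are invented names.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Case-insensitive check for a ".zip" extension (what iequals does above).
bool isZipFile(const fs::path& p)
{
    std::string ext = p.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext == ".zip";
}

// Wraps s in single quotes for a POSIX shell; an embedded ' becomes '\''.
std::string shellQuote(const std::string& s)
{
    std::string quoted = "'";
    for (char c : s)
    {
        if (c == '\'')
            quoted += "'\\''";
        else
            quoted += c;
    }
    quoted += "'";
    return quoted;
}

// Builds the same unzip command line as above, with both paths quoted.
std::string buildUnzipCommand(const fs::path& archive, const fs::path& destination)
{
    return "unzip -o " + shellQuote(archive.string()) + " -d " + shellQuote(destination.string());
}
```

The resulting string can be passed to system() exactly as strCmd is in the original loop.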

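July 24: waiting for change, waiting for opportunity

Reading my own old notes is a great pleasure, and as I look back I keep being surprised at how great I once was. Was I really greater then than I am now? For example, reading the note from July 17 last year, I have no memory at all of set having that const_iterator issue; and the little program from June 11 last year is so well written: I managed to pass a callback function as a parameter in order to invoke different methods. Such clever tricks were truly the pinnacle.
This is what I found about stateless IPv6 configuration. The reason is that my devices' IPv4 addresses keep changing after a reboot, so is there a way to fix them permanently? I do not mean a static IPv4 address on each device but a setting on the router side. It seems there is one, but the configuration UI is rather complicated; I set it up once and then forgot how. I currently lack the basic knowledge to study this, so I am keeping it here for now.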

IPv6 Stateless Address Autoconfiguration

A host performs several steps to autoconfigure its interfaces in IPv6. The autoconfiguration process creates a link-local address. The autoconfiguration process verifies its uniqueness on a link. The process also determines which information should be autoconfigured, addresses, other information, or both. The process determines if the addresses should be obtained through the stateless mechanism, the stateful mechanism, or both mechanisms. This section describes the process for generating a link-local address. This section also describes the process for generating site-local and global addresses by stateless address autoconfiguration. Finally, this section describes the procedure for duplicate address detection.

Stateless Autoconfiguration Requirements

IPv6 defines mechanisms for both stateful address and stateless address autoconfiguration. Stateless autoconfiguration requires no manual configuration of hosts, minimal (if any) configuration of routers, and no additional servers. The stateless mechanism enables a host to generate its own addresses. The stateless mechanism uses local information as well as non-local information that is advertised by routers to generate the addresses. Routers advertise prefixes that identify the subnet or subnets that are associated with a link. Hosts generate an interface identifier that uniquely identifies an interface on a subnet. An address is formed by combining the prefix and the interface identifier. In the absence of routers, a host can generate only link-local addresses. However, link-local addresses are only sufficient for allowing communication among nodes that are attached to the same link.

Stateful Autoconfiguration Model

In the stateful autoconfiguration model, hosts obtain interface addresses or configuration information and parameters from a server. Servers maintain a database that checks which addresses have been assigned to which hosts. The stateful autoconfiguration protocol allows hosts to obtain addresses and other configuration information from a server. Stateless and stateful autoconfiguration complement each other. For example, a host can use stateless autoconfiguration to configure its own addresses, but use stateful autoconfiguration to obtain other information.

When to Use Stateless and Stateful Approaches

The stateless approach is used when a site is not concerned with the exact addresses that hosts use. However, the addresses must be unique. The addresses must also be properly routable. The stateful approach is used when a site requires more precise control over exact address assignments. Stateful and stateless address autoconfiguration can be used simultaneously. The site administrator specifies which type of autoconfiguration to use through the setting of appropriate fields in router advertisement messages.

IPv6 addresses are leased to an interface for a fixed, possibly infinite, length of time. Each address has an associated lifetime that indicates how long the address is bound to an interface. When a lifetime expires, the binding, and address, become invalid and the address can be reassigned to another interface elsewhere. To handle the expiration of address bindings gracefully, an address experiences two distinct phases while the address is assigned to an interface. Initially, an address is preferred, meaning that its use in arbitrary communication is unrestricted. Later, an address becomes deprecated in anticipation that its current interface binding becomes invalid. When the address is in a deprecated state, the use of the address is discouraged, but not strictly forbidden. New communication, for example, the opening of a new TCP connection, should use a preferred address when possible. A deprecated address should be used only by applications that have been using the address. Applications that cannot switch to another address without a service disruption can use a deprecated address.

Duplicate Address Detection Algorithm

To ensure that all configured addresses are likely to be unique on a particular link, nodes run a duplicate address detection algorithm on addresses. The nodes must run the algorithm before assigning the addresses to an interface. The duplicate address detection algorithm is performed on all addresses.

The autoconfiguration process that is specified in this document applies only to hosts and not routers. Because host autoconfiguration uses information that is advertised by routers, routers need to be configured by some other means. However, routers probably generate link-local addresses by using the mechanism that is described in this document. In addition, routers are expected to pass successfully the duplicate address detection procedure on all addresses prior to assigning the address to an interface.

Autoconfiguration Process

This section provides an overview of the typical steps that are performed by an interface during autoconfiguration. Autoconfiguration is performed only on multicast-capable links. Autoconfiguration begins when a multicast-capable interface is enabled, for example, during system startup. Nodes, both hosts and routers, begin the autoconfiguration process by generating a link-local address for the interface. A link-local address is formed by appending the interface's identifier to the well-known link-local prefix.

A node must attempt to verify that a tentative link-local address is not already in use by another node on the link. After verification, the link-local address can be assigned to an interface. Specifically, the node sends a neighbor solicitation message that contains the tentative address as the target. If another node is already using that address, the node returns a neighbor advertisement saying that the node is using that address. If another node is also attempting to use the same address, the node also sends a neighbor solicitation for the target. The number of neighbor solicitation transmissions or retransmissions, and the delay between consecutive solicitations, are link specific. These parameters can be set by system management.

If a node determines that its tentative link-local address is not unique, autoconfiguration stops and manual configuration of the interface is required. To simplify recovery in this instance, an administrator can supply an alternate interface identifier that overrides the default identifier. Then, the autoconfiguration mechanism can be applied by using the new, presumably unique, interface identifier. Alternatively, link-local and other addresses need to be configured manually.

After a node determines that its tentative link-local address is unique, the node assigns the address to the interface. At this point, the node has IP-level connectivity with neighboring nodes. The remaining autoconfiguration steps are performed only by hosts.

Obtaining Router Advertisement

The next phase of autoconfiguration involves obtaining a router advertisement or determining that no routers are present. If routers are present, the routers send router advertisements that specify what type of autoconfiguration a host should perform. If no routers are present, stateful autoconfiguration is invoked.

Routers send router advertisements periodically. However, the delay between successive advertisements is generally longer than a host that performs autoconfiguration can wait. To obtain an advertisement quickly, a host sends one or more router solicitations to the all-routers multicast group. Router advertisements contain two flags that indicate what type of stateful autoconfiguration (if any) should be performed. A managed address configuration flag indicates whether hosts should use stateful autoconfiguration to obtain addresses. An other stateful configuration flag indicates whether hosts should use stateful autoconfiguration to obtain additional information, excluding addresses.

Prefix Information

Router advertisements also contain zero or more prefix information options that contain information that stateless address autoconfiguration uses to generate site-local and global addresses. The stateless address and stateful address autoconfiguration fields in router advertisements are processed independently. A host can use both stateful address and stateless address autoconfiguration simultaneously. One option field that contains prefix information, the autonomous address-configuration flag, indicates whether the option even applies to stateless autoconfiguration. If the option field does apply, additional option fields contain a subnet prefix with lifetime values. These values indicate how long addresses that are created from the prefix remain preferred and valid.

Because routers generate router advertisements periodically, hosts continually receive new advertisements. Hosts process the information that is contained in each advertisement as described previously. Hosts add to the information. Hosts also refresh the information that is received in previous advertisements.

Address Uniqueness

For safety, all addresses must be tested for uniqueness prior to their assignment to an interface. The situation is different for addresses that are created through stateless autoconfiguration. The uniqueness of an address is determined primarily by the portion of the address that is formed from an interface identifier. Thus, if a node has already verified the uniqueness of a link-local address, additional addresses need not be tested individually. The addresses must be created from the same interface identifier. In contrast, all addresses that are obtained manually should be tested individually for uniqueness. The same is true for addresses that are obtained by stateful address autoconfiguration. Some sites believe that the overhead of performing duplicate address detection outweighs its benefits. For these sites, the use of duplicate address detection can be disabled by setting a per-interface configuration flag.

To accelerate the autoconfiguration process, a host can generate its link-local address, and verify its uniqueness, while the host waits for a router advertisement. A router might delay a response to a router solicitation for a few seconds. Consequently, the total time necessary to complete autoconfiguration can be significantly longer if the two steps are done serially.

对于香港这样的殖民地奴才当权者太缺乏施政的手段了,如果当年邓公没有那么早仙去怎么会让香港沦落到如此地步?没有雷霆手段何谈什么菩萨心肠???

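July 28: waiting for change, waiting for opportunity

It took me quite some time to grasp a simple truth: boost's deadline timer has to be cancelled before io_service's run can finish, because the timer itself is a pending task. The problem arose because I reset the timer's expiry at the start of every asio handler callback. That is fine, since resetting cancels the previous wait. But when a callback was invoked with an error and I did not cancel the timer, io_service kept running until the timer expired.
Here are a few music videos from my favorite web series, The Guild, a web comedy with six seasons in total. The lead, who is also the creator, draws on personal experience to tell a set of funny, comedic little stories. The dialogue is lively and fun but by no means easy to follow, especially for non-English-speaking viewers unfamiliar with World of Warcraft or online games; in fact, even for native English speakers, the characters' language is a foreign language of its own. I strongly recommend supporting these true artists, for no other reason than art itself. If you cannot donate or buy the DVDs, at least watch the official channel to support them: www.watchtheguild.com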

July 29: waiting for change, waiting for opportunity

I am going to study The Art of War (《孙子兵法》, by Sun Wu); here is the full text:
  1. 始计篇

    孙子曰:兵者,国之大事,死生之地,存亡之道,不可不察也。   

    故经之以五事,校之以计,而索其情:一曰道,二曰天,三曰地,四曰将、五曰法。道者,令民与上同意也,故可以与之死,可以与之生,而不畏危。天者,阴阳,寒暑、时制也。地者,远近、险易、广狭、死生也。将者,智、信、仁、勇、严也。法者,曲制、官道、主用也。凡此五者,将莫不闻,知之者胜,不知者不胜。故校之以计,而索其情,曰:主孰有道?将孰有能?天地孰得?法令孰行?兵众孰强?士卒孰练?赏罚孰明?吾以此知胜负矣。   

    将听吾计,用之必胜,留之;将不听吾计,用之必败,去之。计利以听,乃为之势,以佐其外。势者,因利而制权也。   

    兵者,诡道也。故能而示之不能,用而示之不用,近而示之远,远而示之近;利而诱之,乱而取之,实而备之,强而避之,怒而挠之,卑而骄之,佚而劳之,亲而离之。攻其无备,出其不意。此兵家之胜,不可先传也。   

    夫未战而庙算胜者,得算多也;未战而庙算不胜者,得算少也。多算胜,少算不胜,而况于无算乎?吾以此观之,胜负见矣。

  2. 作战篇

    孙子曰:凡用兵之法,驰车千驷,革车千乘,带甲十万,千里馈粮,则内外之费,宾客之用,胶漆之材,车甲之奉,日费千金,然后十万之师举矣。   

    其用战也胜,久则钝兵挫锐,攻城则力屈,久暴师则国用不足。夫钝兵挫锐,屈力殚货,则诸侯乘其弊而起,虽有智者,不能善其后矣。故兵闻拙速,未睹巧之久也。夫兵久而国利者,未之有也。故不尽知用兵之害者,则不能尽知用兵之利也。   

    善用兵者,役不再籍,粮不三载;取用于国,因粮于敌,故军食可足也。国之贫于师者远输,远输则百姓贫。近于师者贵卖,贵卖则百姓财竭,财竭则急于丘役。力屈、财殚,中原内虚于家。百姓之费,十去其七;公家之费,破车罢马,甲胄矢弩。戟楯蔽橹,丘牛大车,十去其六。   

    故智将务食于敌。食敌一钟,当吾二十钟;芑秆一石,当吾二十石。   

    故杀敌者,怒也;取敌之利者,货也。故车战,得车十乘已上,赏其先得者,而更其旌旗,车杂而乘之,卒善而养之,是谓胜敌而益强。   

    故兵贵胜,不贵久。故知兵之将,生民之司命,国家安危之主也。

  3. 谋攻篇

    孙子曰:夫用兵之法,全国为上,破国次之,全军为上,破军次之;全旅为上,破旅次之;全卒为上,破卒次之;全伍为上,破伍次之。是故百战百胜,非善之善者也;不战而屈人之兵,善之善者也。   

    故上兵伐谋,其次伐交,其次伐兵,其下攻城。攻城之法为不得已。修橹轒辒,具器械,三月而后成,距堙,又三月而后已。将不胜其忿而蚁附之,杀士三分之一而城不拔者,此攻之灾也。   

    故善用兵者,屈人之兵而非战也,拔人之城而非攻也,毁人之国而非久也,必以全争于天下。故兵不顿而利可全,此谋攻之法也。   

    故用兵之法,十则围之,五则攻之,倍则分之,敌则能战之,少则能逃之,不若则能避之。故小敌之坚,大敌之擒也。   

    夫将者,国之辅也。辅周,则国必强;辅隙,则国必弱。   

    故君之所以患于军者三:不知军之不可以进而谓之进,不知军之不可以退而谓之退,是谓“縻军”;不知三军之事,而同三军之政者,则军士惑矣;不知三军之权,而同三军之任,则军士疑矣。三军既惑且疑,则诸侯之难至矣,是谓“乱军引胜”。   

    故知胜有五:知可以战与不可以战者胜,识众寡之用者胜,上下同欲者胜,以虞待不虞者胜,将能而君不御者胜。此五者,知胜之道也。   

    故曰:知彼知己者,百战不殆;不知彼而知己,一胜一负,不知彼,不知己,每战必殆。

  4. 军形篇

    孙子曰:昔之善战者,先为不可胜,以待敌之可胜。不可胜在己,可胜在敌。故善战者,能为不可胜,不能使敌之可胜。故曰:胜可知,而不可为。   

    不可胜者,守也;可胜者,攻也。守则不足,攻则有余(竹简为:守则有余,攻则不足)。善守者,藏于九地之下,善攻者,动于九天之上,故能自保而全胜也。   

    见胜不过众人之所知,非善之善者也;战胜而天下曰善,非善之善者也。故举秋毫不为多力,见日月不为明目,闻雷霆不为聪耳。古之所谓善战者,胜于易胜者也。故善战者之胜也,无智名,无勇功。故其战胜不忒,不忒者,其所措必胜,胜已败者也。故善战者,立于不败之地,而不失敌之败也。是故胜兵先胜而后求战,败兵先战而后求胜。善用兵者,修道而保法,故能为胜败之政。   

    兵法:一曰度,二曰量,三曰数,四曰称,五曰胜。地生度,度生量,量生数,数生称,称生胜。故胜兵若以镒称铢,败兵若以铢称镒。胜者之战民也,若决积水于千仞之溪者,形也。

  5. 兵势篇

    孙子曰:凡治众如治寡,分数是也;斗众如斗寡,形名是也;三军之众,可使必受敌而无败者,奇正是也;兵之所加,如以碫投卵者,虚实是也。   

    凡战者,以正合,以奇胜。故善出奇者,无穷如天地,不竭如江海。终而复始,日月是也。死而更生,四时是也。声不过五,五声之变,不可胜听也;色不过五,五色之变,不可胜观也;味不过五,五味之变,不可胜尝也;战势不过奇正,奇正之变,不可胜穷也。奇正相生,如循环之无端,孰能穷之哉!   

    激水之疾,至于漂石者,势也;鸷鸟之疾,至于毁折者,节也。故善战者,其势险,其节短。势如扩弩,节如发机。纷纷纭纭,斗乱而不可乱;浑浑沌沌,形圆而不可败。乱生于治,怯生于勇,弱生于强。治乱,数也;勇怯,势也;强弱,形也。   

    故善动敌者,形之,敌必从之;予之,敌必取之。以利动之,以卒待之。故善战者,求之于势,不责于人故能择人而任势。任势者,其战人也,如转木石。木石之性,安则静,危则动,方则止,圆则行。   

    故善战人之势,如转圆石于千仞之山者,势也。

  6. 虚实篇

    孙子曰:凡先处战地而待敌者佚,后处战地而趋战者劳,故善战者,致人而不致于人。能使敌人自至者,利之也;能使敌人不得至者,害之也,故敌佚能劳之,饱能饥之,安能动之。出其所不趋,趋其所不意。行千里而不劳者,行于无人之地也。   

    攻而必取者,攻其所不守也;守而必固者,守其所不攻也。故善攻者,敌不知其所守;善守者,敌不知其所攻。微乎微乎,至于无形。神乎神乎,至于无声,故能为敌之司命。进而不可御者,冲其虚也;退而不可追者,速而不可及也。故我欲战,敌虽高垒深沟,不得不与我战者,攻其所必救也;我不欲战,画地而守之,敌不得与我战者,乖其所之也。   

    故形人而我无形,则我专而敌分。我专为一,敌分为十,是以十攻其一也,则我众而敌寡;能以众击寡者,则吾之所与战者,约矣。吾所与战之地不可知,不可知,则敌所备者多;敌所备者多,则吾所与战者,寡矣。   

    故备前则后寡,备后则前寡,备左则右寡,备右则左寡,无所不备,则无所不寡。寡者,备人者也;众者,使人备己者也。   

    故知战之地,知战之日,则可千里而会战。不知战地,不知战日,则左不能救右,右不能救左,前不能救后,后不能救前,而况远者数十里,近者数里乎?   

    以吾度之,越人之兵虽多,亦奚益于胜败哉?故曰:胜可为也。敌虽众,可使无斗。故策之而知得失之计,作之而知动静之理,形之而知死生之地,角之而知有余不足之处。故形兵之极,至于无形。无形,则深间不能窥,智者不能谋。因形而错胜于众,众不能知;人皆知我所以胜之形,而莫知吾所以制胜之形。故其战胜不复,而应形于无穷。   

    夫兵形象水,水之形,避高而趋下,兵之形,避实而击虚。水因地而制流,兵因敌而制胜。故兵无常势,水无常形,能因敌变化而取胜者,谓之神。   

    故五行无常胜,四时无常位,日有短长,月有死生。

  7. 军争篇

    孙子曰:凡用兵之法,将受命于君,合军聚众,交和而舍,莫难于军争。军争之难者,以迂为直,以患为利。   

    故迂其途,而诱之以利,后人发,先人至,此知迂直之计者也。军争为利,军争为危。举军而争利则不及,委军而争利则辎重捐。是故卷甲而趋,日夜不处,倍道兼行,百里而争利,则擒三将军,劲者先,疲者后,其法十一而至;五十里而争利,则蹶上将军,其法半至;三十里而争利,则三分之二至。是故军无辎重则亡,无粮食则亡,无委积则亡。故不知诸侯之谋者,不能豫交;不知山林、险阻、沮泽之形者,不能行军;不用乡导者,不能得地利。故兵以诈立,以利动,以分和为变者也。故其疾如风,其徐如林,侵掠如火,不动如山,难知如阴,动如雷震。掠乡分众,廓地分利,悬权而动。先知迂直之计者胜,此军争之法也。    

    《军政》曰:“言不相闻,故为之金鼓;视不相见,故为之旌旗。”夫金鼓旌旗者,所以一民之耳目也。民既专一,则勇者不得独进,怯者不得独退,此用众之法也。故夜战多金鼓,昼战多旌旗,所以变人之耳目也。    

    三军可夺气,将军可夺心。是故朝气锐,昼气惰,暮气归。善用兵者,避其锐气,击其惰归,此治气者也。以治待乱,以静待哗,此治心者也。以近待远,以佚待劳,以饱待饥,此治力者也。无邀正正之旗,勿击堂堂之阵,此治变者也。   

    故用兵之法,高陵勿向,背丘勿逆,佯北勿从,锐卒勿攻,饵兵勿食,归师勿遏,围师遗阙,穷寇勿迫,此用兵之法也。

  8. 九变篇

    孙子曰:凡用兵之法,将受命于君,合军聚众。圮地无舍,衢地交合,绝地无留,围地则谋,死地则战,途有所不由,军有所不击,城有所不攻,地有所不争,君命有所不受。    

    故将通于九变之利者,知用兵矣;将不通九变之利,虽知地形,不能得地之利矣;治兵不知九变之术,虽知五利,不能得人之用矣。   

    是故智者之虑,必杂于利害,杂于利而务可信也,杂于害而患可解也。是故屈诸侯者以害,役诸侯者以业,趋诸侯者以利。故用兵之法,无恃其不来,恃吾有以待之;无恃其不攻,恃吾有所不可攻也。   

    故将有五危,必死可杀,必生可虏,忿速可侮,廉洁可辱,爱民可烦。凡此五者,将之过也,用兵之灾也。覆军杀将,必以五危,不可不察也。

  9. 行军篇

    孙子曰:凡处军相敌:绝山依谷,视生处高,战隆无登,此处山之军也。绝水必远水;客绝水而来,勿迎之于水内,令半济而击之,利;欲战者,无附于水而迎客;视生处高,无迎水流,此处水上之军也。绝斥泽,惟亟去无留;若交军于斥泽之中,必依水草而背众树,此处斥泽之军也。平陆处易,而右背高,前死后生,此处平陆之军也。凡此四军之利,黄帝之所以胜四帝也。   

    凡军好高而恶下,贵阳而贱阴,养生而处实,军无百疾,是谓必胜。丘陵堤防,必处其阳,而右背之。此兵之利,地之助也。   

    上雨,水沫至,欲涉者,待其定也。   

    凡地有绝涧、天井、天牢、天罗、天陷、天隙,必亟去之,勿近也。吾远之,敌近之;吾迎之,敌背之。   

    军行有险阻、潢井、葭苇、山林、蘙荟者,必谨覆索之,此伏奸之所处也。   

    敌近而静者,恃其险也;远而挑战者,欲人之进也;其所居易者,利也。   

    众树动者,来也;众草多障者,疑也;鸟起者,伏也;兽骇者,覆也;尘高而锐者,车来也;卑而广者,徒来也;散而条达者,樵采也;少而往来者,营军也。   

    辞卑而益备者,进也;辞强而进驱者,退也;轻车先出居其侧者,陈也;无约而请和者,谋也;奔走而陈兵车者,期也;半进半退者,诱也。   

    杖而立者,饥也;汲而先饮者,渴也;见利而不进者,劳也;鸟集者,虚也;夜呼者,恐也;军扰者,将不重也;旌旗动者,乱也;吏怒者,倦也;粟马肉食,军无悬缻,不返其舍者,穷寇也;谆谆翕翕,徐与人言者,失众也;数赏者,窘也;数罚者,困也;先暴而后畏其众者,不精之至也;来委谢者,欲休息也。兵怒而相迎,久而不合,又不相去,必谨察之。   

    兵非益多也,惟无武进,足以并力、料敌、取人而已。夫惟无虑而易敌者,必擒于人。   

    卒未亲附而罚之,则不服,不服则难用也。卒已亲附而罚不行,则不可用也。故令之以文,齐之以武,是谓必取。令素行以教其民,则民服;令不素行以教其民,则民不服。令素行者,与众相得也。

  10. 地形篇

    孙子曰:地形有通者,有挂者,有支者,有隘者,有险者,有远者。我可以往,彼可以来,曰通;通形者,先居高阳,利粮道,以战则利。可以往,难以返,曰挂;挂形者,敌无备,出而胜之;敌若有备,出而不胜,难以返,不利。我出而不利,彼出而不利,曰支;支形者,敌虽利我,我无出也;引而去之,令敌半出而击之,利。隘形者,我先居之,必盈之以待敌;若敌先居之,盈而勿从,不盈而从之。险形者,我先居之,必居高阳以待敌;若敌先居之,引而去之,勿从也。远形者,势均,难以挑战,战而不利。凡此六者,地之道也;将之至任,不可不察也。   

    故兵有走者,有弛者,有陷者,有崩者,有乱者,有北者。凡此六者,非天之灾,将之过也。夫势均,以一击十,曰走;卒强吏弱,曰弛,吏强卒弱,曰陷;大吏怒而不服,遇敌怼而自战,将不知其能,曰崩;将弱不严,教道不明,吏卒无常,陈兵纵横,曰乱;将不能料敌,以少合众,以弱击强,兵无选锋,曰北。凡此六者,败之道也;将之至任,不可不察也。   

    夫地形者,兵之助也。料敌制胜,计险厄远近,上将之道也。知此而用战者必胜,不知此而用战者必败。   

    故战道必胜,主曰无战,必战可也;战道不胜,主曰必战,无战可也。故进不求名,退不避罪,唯人是保,而利合于主,国之宝也。   

    视卒如婴儿,故可与之赴深溪;视卒如爱子,故可与之俱死。厚而不能使,爱而不能令,乱而不能治,譬若骄子,不可用也。   

    知吾卒之可以击,而不知敌之不可击,胜之半也;知敌之可击,而不知吾卒之不可以击,胜之半也;知敌之可击,知吾卒之可以击,而不知地形之不可以战,胜之半也。故知兵者,动而不迷,举而不穷。故曰:知彼知己,胜乃不殆;知天知地,胜乃不穷。

  11. 九地篇

    孙子曰:用兵之法,有散地,有轻地,有争地,有交地,有衢地,有重地,有圮地,有围地,有死地。诸侯自战其地,为散地。入人之地不深者,为轻地。我得则利,彼得亦利者,为争地。我可以往,彼可以来者,为交地。诸侯之地三属,先至而得天下之众者,为衢地。入人之地深,背城邑多者,为重地。行山林、险阻、沮泽,凡难行之道者,为圮地。所由入者隘,所从归者迂,彼寡可以击吾之众者,为围地。疾战则存,不疾战则亡者,为死地。是故散地则无战,轻地则无止,争地则无攻,交地则无绝,衢地则合交,重地则掠,圮地则行,围地则谋,死地则战。   

    所谓古之善用兵者,能使敌人前后不相及,众寡不相恃,贵贱不相救,上下不相收,卒离而不集,兵合而不齐。合于利而动,不合于利而止。敢问:“敌众整而将来,待之若何?”曰:“先夺其所爱,则听矣。”   

    兵之情主速,乘人之不及,由不虞之道,攻其所不戒也。   

    凡为客之道:深入则专,主人不克;掠于饶野,三军足食;谨养而勿劳,并气积力,运兵计谋,为不可测。投之无所往,死且不北,死焉不得,士人尽力。兵士甚陷则不惧,无所往则固。深入则拘,不得已则斗。是故其兵不修而戒,不求而得,不约而亲,不令而信,禁祥去疑,至死无所之。吾士无余财,非恶货也;无余命,非恶寿也。令发之日,士卒坐者涕沾襟。偃卧者涕交颐。投之无所往者,诸、刿之勇也。   

    故善用兵者,譬如率然;率然者,常山之蛇也。击其首则尾至,击其尾则首至,击其中则首尾俱至。敢问:“兵可使如率然乎?”曰:“可。”夫吴人与越人相恶也,当其同舟而济,遇风,其相救也如左右手。是故方马埋轮,未足恃也;齐勇若一,政之道也;刚柔皆得,地之理也。故善用兵者,携手若使一人,不得已也。   

    将军之事:静以幽,正以治。能愚士卒之耳目,使之无知。易其事,革其谋,使人无识;易其居,迂其途,使人不得虑。帅与之期,如登高而去其梯;帅与之深入诸侯之地,而发其机,焚舟破釜,若驱群羊,驱而往,驱而来,莫知所之。聚三军之众,投之于险,此谓将军之事也。九地之变,屈伸之利,人情之理,不可不察。   

    凡为客之道:深则专,浅则散。去国越境而师者,绝地也;四达者,衢地也;入深者,重地也;入浅者,轻地也;背固前隘者,围地也;无所往者,死地也。   

    是故散地,吾将一其志;轻地,吾将使之属;争地,吾将趋其后;交地,吾将谨其守;衢地,吾将固其结;重地,吾将继其食;圮地,吾将进其涂;围地,吾将塞其阙;死地,吾将示之以不活。   

    故兵之情,围则御,不得已则斗,过则从。是故不知诸侯之谋者,不能预交;不知山林、险阻、沮泽之形者,不能行军;不用乡导者,不能得地利。四五者,不知一,非霸王之兵也。夫霸王之兵,伐大国,则其众不得聚;威加于敌,则其交不得合。是故不争天下之交,不养天下之权,信己之私,威加于敌,故其城可拔,其国可隳。施无法之赏,悬无政之令,犯三军之众,若使一人。犯之以事,勿告以言;犯之以利,勿告以害。   

    投之亡地然后存,陷之死地然后生。夫众陷于害,然后能为胜败。   

    故为兵之事,在于顺详敌之意,并敌一向,千里杀将,此谓巧能成事者也。   

    是故政举之日,夷关折符,无通其使;厉于廊庙之上,以诛其事。敌人开阖,必亟入之。先其所爱,微与之期。践墨随敌,以决战事。是故始如处女,敌人开户,后如脱兔,敌不及拒。

  12. 火攻篇

    孙子曰:凡火攻有五:一曰火人,二曰火积,三曰火辎,四曰火库,五曰火队。行火必有因,烟火必素具。发火有时,起火有日。时者,天之燥也;日者,月在箕、壁、翼、轸也。凡此四宿者,风起之日也。   

    凡火攻,必因五火之变而应之。火发于内,则早应之于外。火发兵静者,待而勿攻,极其火力,可从而从之,不可从而止。火可发于外,无待于内,以时发之。火发上风,无攻下风。昼风久,夜风止。凡军必知有五火之变,以数守之。   

    故以火佐攻者明,以水佐攻者强。水可以绝,不可以夺。夫战胜攻取,而不修其功者凶,命曰费留。故曰:明主虑之,良将修之。非利不动,非得不用,非危不战。主不可以怒而兴师,将不可以愠而致战;合于利而动,不合于利而止。怒可以复喜,愠可以复悦;亡国不可以复存,死者不可以复生。故明君慎之,良将警之,此安国全军之道也。

  13. 用间篇

    孙子曰:凡兴师十万,出征千里,百姓之费,公家之奉,日费千金;内外骚动,怠于道路,不得操事者,七十万家。相守数年,以争一日之胜,而爱爵禄百金,不知敌之情者,不仁之至也,非人之将也,非主之佐也,非胜之主也。故明君贤将,所以动而胜人,成功出于众者,先知也。先知者,不可取于鬼神,不可象于事,不可验于度,必取于人,知敌之情者也。   

    故用间有五:有因间,有内间,有反间,有死间,有生间。五间俱起,莫知其道,是谓神纪,人君之宝也。因间者,因其乡人而用之。内间者,因其官人而用之。反间者,因其敌间而用之。死间者,为诳事于外,令吾间知之,而传于敌间也。生间者,反报也。   

    故三军之事,莫亲于间,赏莫厚于间,事莫密于间。非圣智不能用间,非仁义不能使间,非微妙不能得间之实。微哉!微哉!无所不用间也。间事未发,而先闻者,间与所告者皆死。(莫亲于间:指没有比间谍更应成为亲信了。赏莫厚于间:指没有比间谍更应该得到丰富的奖赏了。事莫密于间:没有经间谍的事更应该保守机密了。间事未发:用间之事还没有开始进行。间与所告者皆死:间谍和告知用间之事的人都要处死。)   

    凡军之所欲击,城之所欲攻,人之所欲杀,必先知其守将,左右,谒者,门者,舍人之姓名,令吾间必索知之。   

    必索敌人之间来间我者,因而利之,导而舍之,故反间可得而用也。因是而知之,故乡间、内间可得而使也;因是而知之,故死间为诳事,可使告敌。因是而知之,故生间可使如期。五间之事,主必知之,知之必在于反间,故反间不可不厚也。   

    昔殷之兴也,伊挚在夏;周之兴也,吕牙在殷。故惟明君贤将,能以上智为间者,必成大功。此兵之要,三军之所恃而动也。


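July 30: waiting for change, waiting for opportunity

The paper on metabolic growth theory (《代谢增长论》) by Chen Ping, the physicist-turned-economist I admire most, is beyond me even at the abstract: it is very deep and specialized. I am saving a copy here.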
弯弯的电视台推崇的所谓经济学家向松祚我也看了他去年的视频,他狂言中国经济指标已经是负的了,我对于统计数字抱有怀疑但是也没有可能吹牛吹出几万亿的增长吧?他的某些观点虽说不错但是耸人听闻的意味大于问题解决方案的提出,我对于此人的智商的怀疑小于他的其心可诛的恶意的估计,总之,此人绝不可用也。其实他所说的一些问题都是存在的,但是很多的时候忽略了问题的关键确实作为学者的恶毒之处,对于普通人来说完全没有量化概念的情况下给出所谓的“有毒食品”而不给出剂量算是耍流氓,那么对于有能力知晓各个因素的量化数据而依然用耸人听闻的方式来推送自己的偏激观点就让人感觉居心叵测了,这就是我认为其心可诛的地方。中国深层次的问题的确是有地方债的问题,而且不是一般的严重,可是从发展的观点来看却是如同成长的烦恼一样的绕不开,正所谓是必须的,比如全国各地到处修建地铁举债有错吗?我以为没有错,这样的债务首先当地市民是直接的使用者,是缓解交通的唯一的解决办法,是长期助推经济的大目标,只要地方政府自己能够筹集到资金不妨放开了搞,当然有些太小的城市硬要上发改委设定门槛是对的,总之这种债务是良性的,就如同高铁电力一样要先行,单纯的从短期收益的资本投资来看举债规模是低了一个层级的弱视,如果作为西方投行培养出的洋买办看不到长远是可以原谅的。西方的有些经济媒体是目光锐利的看到了贸易战反而掩盖了中国经济深层次的矛盾,可是反过来看难道美国不是用一些闪光的政治议题来掩盖其破产的经济顽疾吗?两个党都对于国家财政的破产悄无声息的达成了妥协闷声大印钞票度日了。谁比谁傻多少?
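Yesterday's problem was that I had not fully realized that a throw (xxx_exception) clause in a function declaration is enforced: only the listed exception types may leave the function. That is logical, otherwise what would the declaration be for? When any other type is thrown, the runtime calls std::unexpected(), which by default ends in std::terminate(), so the caller never gets a chance to catch it. This explains my earlier puzzlement, when I thought that throwing an exception object created temporarily on the stack was causing crashes: since it could not be caught, the process dumped core, which is nothing as serious as a segmentation fault. My fear of bad memory addresses runs deep; by comparison, once I understood it was just an unhandled exception I felt much relieved. (Dynamic exception specifications were deprecated in C++11 and removed in C++17.)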
对于香港的动乱我以为不妨让子弹再飞一会儿,一方面能让反对派彻底暴露以便清楚流毒,另一方面让香港年轻人多张狂一些也有好处,给香港最后的经济支柱降降温有好处,客观上香港的没落对于大陆深圳上海的发展有大好处,同时香港的这个烫山芋还是不要接手让他自己烂掉比较好。
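I came across a very good series of articles about memory. I manually laid out and filled in a UMA vs NUMA diagram as my own contribution, but unfortunately I could not find the original image, so mine may not match the original meaning.
Here is my collection of the 100 best science fiction movies as rated on the web; I do agree with the number one pick. I stripped out the ads, which I count as my contribution.
  1. The 100 best science fiction movies, part 1

  2. The 100 best science fiction movies, part 2

  3. The 100 best science fiction movies, part 3

  4. The 100 best science fiction movies, part 4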


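August 1: waiting for change, waiting for opportunity

I continued copying other articles by the memory expert Frank Denneman. Most of them I cannot understand at all; the one about the AMD Opteron 6100 interests me only because it is a model close to my server's.
I do not know whether the system-configuration script inxi is cross-platform; I am saving the Ubuntu one to see whether it works on other platforms.
Here is an article about vSphere storage. I know almost nothing about vSphere, having always been busy with servers; I should make time to catch up on other areas.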

The 2016 NUMA Deep Dive Series:
Part 0: Introduction NUMA Deep Dive Series
Part 1: From UMA to NUMA
Part 2: System Architecture
Part 3: Cache Coherency
Part 4: Local Memory Optimization
Part 5: ESXi VMkernel NUMA Constructs
Part 6: NUMA Initial Placement and Load Balancing Operations
Part 7: From NUMA to UMA


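August 3: waiting for change, waiting for opportunity

This is said to be an Alibaba interview question: given that sqrt(2) is approximately 1.414, compute it to 10 decimal places without using any math library. It is of course just a little mental warm-up. I still had to rely on the debugger to understand my own algorithm: keep incrementing the last digit until the square passes the turning point, and only when it first overshoots 2 append one more decimal digit and start over. My final result was 1.4142135630. Note that this is only good to nine decimal places: the program stops on the upper bound 1.414213563 and then appends a zero, whereas sqrt(2) = 1.41421356237...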
//============================================================================
// Name        : numberTest.cpp
// Author      : 
#include <iostream>
using namespace std;
int main()
{
	unsigned int result = 414;
	int number = 3;
	bool bPreviousSmaller = (1.0+result/1000.0)*(1.0+result/1000.0) < 2.0;
	while (number < 10)
	{
		double value = 1;
		for (int i = 0; i < number; i ++)
		{
			value*=10;
		}
		double temp = (1.0+result/value)*(1.0+result/value);
		cout << "value:"<<value<< "\tnumber:"<<number << "\tresult:" <<  result << "\ttemp:" << temp<< endl;
		if (temp < 2.0)
		{
			bPreviousSmaller = true;
			result ++;
		}
		else
		{
			if (bPreviousSmaller)
			{
				bPreviousSmaller = false;
				number ++;
				result *=10;
			}
			else
			{
				result--;
			}
		}
	}
	cout << result << endl;
	return 0;
}
香港的动乱如何收场呢?看来这些人是所谓的不流血决不罢休了。应该使用何种的震慑手段来修理这些小屁孩呢?

这个是我的命令行:
LANG=zh_CN.utf8 ffmpeg -y -ss 00:41:55 -i Kung.Fu.Hustle.2004.iNTERNAL.DVDRip.XviD.Dual-Audio-COC.CD2.avi -t 00:00:12 -vf "drawtext=fontcolor=green:fontsize=36:shadowy=4:\x='if(gte(t,1), (main_w-mod(t*60,main_w)), NAN)':y=(main_h-line_h-10):text='没有雷霆手段怎显我菩萨心肠!'" /tmp/mercy.m4v

八月七日等待变化等待机会

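Here is a very basic example of regex_match sub-matches, taken from a small piece of my work. I need to reshape the so-called devpath of a memory module, that is, the device location, because the format returned by SNMP differs from the one returned by other APIs. For a module on CPU 0, channel F, DIMM 1, the human-readable form is "( CPU_0 ) / ( Channel_F ) / ( Channel_F Dimm_1 )". That format is rather arbitrary, whereas SNMP gives the abbreviated form CPU0_F1. So I use regex_match to capture the three variables. Note that I use four pairs of parentheses to capture four parts, two of which are duplicates. The point is not to be misled by the literal (): only the regex's own unescaped parentheses determine the number of sub-matches. Together with the full match there are 5, so what.size() is 5, and each sub-match's first/second are two pointers to its start and end.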

#include <iostream>
#include <boost/regex.hpp>
using namespace std;
int main()
{
        string strInput = "( CPU_0 ) / ( Channel_F ) / ( Channel_F Dimm_1 )";
        boost::regex expression("\\( CPU_([0-9]) \\) / \\( Channel_([A-Z]) \\) / \\( Channel_([A-Z]) Dimm_([0-9]) \\)");
        boost::cmatch what;
        string str;
        if (boost::regex_match(strInput.c_str(), what, expression))
        {
                str = "CPU";
                str += string(what[1].first, what[1].second);
                str += "_";
                str += string(what[2].first, what[2].second);
                str += string(what[4].first, what[4].second);
                cout << str << endl;
        }
        return 0;
}
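The same conversion can be sketched with the standard library's std::regex instead of boost::regex, assuming a C++11 compiler. The helper name toSnmpDevPath is hypothetical and only serves the illustration; it returns an empty string when the input has a different shape.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: turns "( CPU_0 ) / ( Channel_F ) / ( Channel_F Dimm_1 )"
// into the SNMP-style short form "CPU0_F1". Returns "" when the input does not match.
std::string toSnmpDevPath(const std::string& devPath)
{
    // \( and \) are literal parentheses; the four unescaped groups are the sub-matches.
    static const std::regex expression(
        R"re(\( CPU_([0-9]) \) / \( Channel_([A-Z]) \) / \( Channel_([A-Z]) Dimm_([0-9]) \))re");
    std::smatch what;
    if (!std::regex_match(devPath, what, expression))
    {
        return "";
    }
    // what.size() is 5 here: what[0] is the whole match, what[1]..what[4] are the groups.
    return "CPU" + what[1].str() + "_" + what[2].str() + what[4].str();
}
```

Calling toSnmpDevPath with the sample string from the entry should give the short form; what[3] repeats the channel letter and is simply ignored.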

八月十日等待变化等待机会

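Is gettid the same as pthread_self()? I don't know and need to experiment; my guess is that they are the same. The thread id in the poco library, however, does not look like a kernel-level thread id; my understanding is that it is a software-level id. What I don't understand is why poco does not use the pthread thread id. Maybe it does and I just have not found the code yet.
For the mapping between MIME types and file extensions I found two sources: an incomplete list from Firefox, and a fairly complete list elsewhere. This question hardly matters in itself, because only Windows binds file extensions strictly and Linux does not care at all. But since a magical tool like file/libmagic can identify a file's type, why not do that in wget? That is what I want to build. It takes three steps, and the first is to collect the mime-ext data. I saved two raw files and will use them to practice regex along the way.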
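On the gettid question, as far as I know the two values are different things on Linux: gettid is the kernel's numeric thread id, while pthread_self returns an opaque pthread_t handle that should only be compared with pthread_equal. A minimal, Linux-only sketch; the helper names kernelThreadId and isSamePthread are made up for the illustration.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Kernel-level thread id (Linux only), obtained through the raw system call.
pid_t kernelThreadId()
{
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// pthread_t is an opaque handle; compare it only with pthread_equal().
bool isSamePthread(pthread_t a, pthread_t b)
{
    return pthread_equal(a, b) != 0;
}
```

In the main thread of a process the kernel thread id equals getpid(), which gives a quick way to check the helper.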

八月十四日等待变化等待机会

关于boost的regex的语法我找到这个官方的资料非常值得收藏

八月十八日等待变化等待机会

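To practice regex I spent several days solving this problem. I collected two lists of MIME types and file extensions from two sources: mimeExt.txt and extMime.txt. The core difficulty is getting the expressions right.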
#include <iostream>
#include <fstream>
#include <boost/regex.hpp>
#include <boost/filesystem.hpp>
#include <map>
#include <string>
using namespace std;
struct MimeExt
{
	string strDescription;
	string strMime;
	string strExt;
	operator string() const
	{
		return strDescription + ";" + strMime + ";" + strExt;
	}
};
typedef map<string, MimeExt> MimeExtMap;
int mimeExt(MimeExtMap& mapExt)
{
	ifstream in("/home/nick/workspace/magicTest/src/mimeExt.txt");
	if (in.is_open())
	{
		string str;
		while (std::getline(in, str))
		{
			boost::regex ex("\\s*([\\w|\\+|/|#|\\.|\\-]+(?:\\s+[\\w|\\+|#|\\-|/|\\.]+)*)"
					"\\s+([\\w][-|\\w]*/[\\w][\\.|\\+|\\-|\\w]*)"
					"\\s+([\\w|\\*]+)\\s*");
			boost::cmatch what;
			if (boost::regex_match(str.c_str(), what, ex, boost::match_default))
			{
				MimeExt mime;
				mime.strDescription = string(what[1].first, what[1].second);
				mime.strMime = string(what[2].first, what[2].second);
				mime.strExt = ".";
				mime.strExt += string(what[3].first, what[3].second);
				MimeExtMap::iterator it = mapExt.find(mime.strMime);
				if (it == mapExt.end())
				{
					mapExt.insert(it, make_pair(mime.strMime, mime));
				}
			}
		}
		in.close();
	}
	return 0;
}
int extMime(MimeExtMap& mapExt)
{
	ifstream in("/home/nick/workspace/magicTest/src/extMime.txt");
	if (in.is_open())
	{
		string str;
		while (std::getline(in, str))
		{
			boost::regex ex("\\s*([\\w](?:[:|\\&|\\w|\\+|/|#|\\.|\\-|\\(|\\)|\\s|'])+[\\w|\\)|\\+])"
					"\\s+([\\w][-|\\w]*/[\\w][\\.|\\+|\\-|\\w|,]*)"
					"\\s*(N/A|[\\.][\\w|\\-]+)[,]?"
					"\\s+([\\(|\\)|\\&|/|\\-|\\+|'|\\.|\\:|\\w|\\s]*)");
			boost::cmatch what;
			if (boost::regex_match(str.c_str(), what, ex, boost::match_default))
			{
				MimeExt mime;
				mime.strDescription = string(what[1].first, what[1].second);
				mime.strMime = string(what[2].first, what[2].second);
				mime.strExt = string(what[3].first, what[3].second);
				MimeExtMap::iterator it = mapExt.find(mime.strMime);
				if (it == mapExt.end())
				{
					mapExt.insert(it, make_pair(mime.strMime, mime));
				}
			}
		}
		in.close();
	}
	return 0;
}
int main()
{
	MimeExtMap mapExt;
	mimeExt(mapExt);
	extMime(mapExt);
	for (MimeExtMap::const_iterator it = mapExt.begin(); it != mapExt.end(); it++)
	{
		cout << it->first << ":\t";
		cout << it->second.operator string() << endl;
	}
}
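One detail worth checking in expressions like the ones above: inside square brackets, | is not alternation but simply one more member of the set, so a class such as [\w|\+|/] also accepts a literal |. A small sketch with std::regex (used here instead of boost::regex so it builds without Boost); the function names are hypothetical.

```cpp
#include <regex>
#include <string>

// True when every character of text belongs to the class [\w|+].
// Inside brackets '|' is an ordinary member of the set.
bool matchesWithPipeInClass(const std::string& text)
{
    static const std::regex withPipe(R"([\w|+]+)");
    return std::regex_match(text, withPipe);
}

// The same class written without the '|' separators.
bool matchesWithoutPipe(const std::string& text)
{
    static const std::regex withoutPipe(R"([\w+]+)");
    return std::regex_match(text, withoutPipe);
}
```

If a literal | should not be accepted, the separators can simply be dropped from the class.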

八月二十日等待变化等待机会

我始终都记不住这个名字:我使用的pdf的编辑器是Xournal。

八月二十三日等待变化等待机会

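While using libmagic I found something odd: for an ordinary Java source file it returned N/A, whereas the file command reports the correct MIME type. With strace I saw that the magic databases being opened differ: file opens /usr/share/misc/magic.mgc, while my libmagic code by default opens /usr/share/misc/magic.mime.mgc. Surprisingly, though, this made no difference to the result.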

八月三十一日等待变化等待机会

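I don't know why Ubuntu 18.04 no longer ships pdftk as a regular package and moved it to snap. I am not familiar with the snap packaging mechanism, and in use it kept reporting file-not-found errors. As a stopgap I used qpdf, but concatenating files needs a rather awkward syntax: qpdf --empty --pages 1.pdf 2.pdf 3.pdf -- output.pdf. I also could not find the right syntax for compressing a file; perhaps it is not supported. Later I saw a better way to compress: ps2pdf input.pdf output.pdf. PostScript seems to be a good format for this kind of processing.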

九月三日等待变化等待机会

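I think the best documentation is still the official boost documentation, because some regex details are specific to boost. I saved a copy of the basic_syntax page. This part is not much trouble and holds few surprises for anyone familiar with regex. Really? Reading it again made me blush, because I am not exactly someone familiar with regex. In POSIX basic syntax only these characters have a special meaning:
.[\*^$
In other words even parentheses are literals. Of course you must state explicitly that you want pure basic syntax, for example: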

// e1 is a case sensitive POSIX-Basic expression:
boost::regex e1(my_expression, boost::regex::basic);
// e2 a case insensitive POSIX-Basic expression:
boost::regex e2(my_expression, boost::regex::basic|boost::regex::icase);
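Having read this far you should be able to spot the slip in the documentation's example. Such examples are written absent-mindedly by veteran programmers for beginners, and I suspect most programmers cannot imagine anyone seriously using the POSIX standard instead of the Perl-style extensions everybody knows. So judge for yourself how misleading this example is: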

An atom can also be repeated with a bounded repeat:

a\{n\} Matches 'a' repeated exactly n times.

a\{n,\} Matches 'a' repeated n or more times.

a\{n,m\} Matches 'a' repeated between n and m times inclusive.

For example:

^a{2,3}$
Note that the braces are not escaped here, but do not conclude that escaping is unnecessary! The correct form is: ^a\{2,3\}$

Will match either of:

aa
aaa

But neither of:

a
aaaa
The small test below cost me most of a day before I realized that in C++ I also have to escape the escape character itself.

#include <iostream>
#include <boost/regex.hpp>
using namespace std;
using namespace boost;
int main()
{
        string array[]={
                        "", "a", "aa", "aaa", "aaaa", "aaaaa"
        };

        regex ex("^a\\{2,3\\}$", regex::basic|regex::icase);
        cmatch what;
        for (size_t i = 0; i < sizeof(array)/sizeof(array[0]); i ++)
        {
                // match_default: with match_partial, "a" would also be reported as a partial match
                if (regex_match(array[i].c_str(), what, ex, match_default))
                {
                        cout << what[0] << endl;
                }
        }
        return 0;
}
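The bounded repeat in POSIX basic syntax can also be tried with std::regex, which offers a std::regex::basic flag. This is only a sketch under the assumption that the standard library's basic grammar behaves like Boost's for this construct; the function name isTwoOrThreeAs is hypothetical.

```cpp
#include <regex>
#include <string>

// POSIX basic syntax: the bounded repeat has to be written \{2,3\},
// which becomes "\\{2,3\\}" inside a C++ string literal.
bool isTwoOrThreeAs(const std::string& text)
{
    static const std::regex ex("a\\{2,3\\}", std::regex::basic);
    // regex_match must cover the whole input, so no ^ and $ are needed here.
    return std::regex_match(text, ex);
}
```

The double backslash is the "escape of the escape": the compiler consumes one level and the regex engine the other.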

九月四日等待变化等待机会

我觉得文档的例子还是要亲自试验一下。

int main()
{
        string array[]={ "abcd", "efg", "def", "ddef", "abcdef", "aabc", "abc", "ABC", "DEF"};
        regex ex("abc\ndef", regex::grep|regex::icase);//grep的意思本来就是basic|newline_alt
        cmatch what;
        for (size_t i = 0; i < sizeof(array)/sizeof(array[0]); i ++)
        {
                if (regex_match(array[i].c_str(), what, ex, match_default))
                {
                        cout << what[0] << endl;
                }
        }
        return 0;
}

九月五日等待变化等待机会

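Next is the perl syntax, which has a few peculiarities. First, the set of special characters grows: .[{}()\*+?|^$. Parentheses and braces are added, for grouping and repeat counts; so are the question mark and plus sign, which also express repeat counts; and the vertical bar, which means "or". I am copying the key points here to reinforce my memory:
  1. . matches any single character, except NULL when match_not_dot_null is passed and newline when match_not_dot_newline is passed.
  2. ^ and $ are still the start and end of a line.
  3. () is a so-called marked sub-expression, one that can be referred to with \n.
  4. Correspondingly, if you do not want a marked sub-expression, use (?:).
  5. * repeats any number of times; + at least once; ? at most once; {} gives a repeat count or a range: {n} is exactly n times, {n,} at least n times, {n,m} between n and m times. Note that {} is special only when it forms a valid repeat count; otherwise it is an ordinary character. In this respect it resembles [], where [ does not become special on its own.
  6. Non-greedy repeats. This matters, because normally a repeat consumes as much as it possibly can; appending a ? limits that greed: *?; +?; ??; {n,}?; {n,m}?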
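Points 3 to 6 can be checked with a few lines of code. The sketch below uses std::regex, whose default ECMAScript grammar is close to the perl syntax for these constructs (an assumption that holds for the cases shown, not for everything in Boost); markedGroups and firstMatch are hypothetical helper names.

```cpp
#include <regex>
#include <string>

// Number of marked sub-expressions in a pattern; (?:...) groups are not counted.
unsigned markedGroups(const std::string& pattern)
{
    return std::regex(pattern).mark_count();
}

// First match of pattern inside text, or "" when nothing matches.
std::string firstMatch(const std::string& text, const std::string& pattern)
{
    const std::regex ex(pattern);
    std::smatch what;
    if (!std::regex_search(text, what, ex))
    {
        return "";
    }
    return what[0].str();
}
```

For instance, firstMatch("<a><b>", "<.+>") takes the whole string because the repeat is greedy, while "<.+?>" stops at the first closing bracket.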
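Some very simple things that I somehow did not know:
  1. When listing file names that contain numbers, how do I sort them by numeric value? Simply with ls -v. I had always assumed -v meant verbose. It does not; it means natural sort, also called version sort.
  2. Another common need is printing file names in single quotes: --quoting-style=shell.
  3. I also need to write these names into a file, each one prefixed with "file": ls --quoting-style=shell -v 古畑任三郎\ 完结篇\ EP01\ 第一晚\ 今、甦る死\ _SD_*_.flv | while read line; do echo "file $line" >> list.txt; done
  4. Why do I do this at all? Because I need to merge these files manually with ffmpeg: ffmpeg -f concat -safe -1 -y -i list.txt -c copy -hide_banner 古畑任三郎\ 完结篇\ EP01\ 第一晚\ 今、甦る死\ _SD.flv
  5. Why merge them manually? Because I use ykdl to download youku videos. Building it directly seemed difficult, so I used the official installation:
    • sudo apt-get install ffmpeg mpv python3-pip
    • pip3 install ykdl --upgrade --user
    • add ~/.local/bin to your PATH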

九月六日等待变化等待机会

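I never quite understood the concept of the possessive quantifier, and the official examples do not explain it clearly. This tutorial seems better.
I have copied it here and hope the author does not mind. It is for personal study only, and if many people benefit, that is presumably what the author would want as well.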

Possessive Quantifiers

The topic on repetition operators or quantifiers explains the difference between greedy and lazy repetition. Greediness and laziness determine the order in which the regex engine tries the possible permutations of the regex pattern. A greedy quantifier first tries to repeat the token as many times as possible, and gradually gives up matches as the engine backtracks to find an overall match. A lazy quantifier first repeats the token as few times as required, and gradually expands the match as the engine backtracks through the regex to find an overall match.

Because greediness and laziness change the order in which permutations are tried, they can change the overall regex match. However, they do not change the fact that the regex engine will backtrack to try all possible permutations of the regular expression in case no match can be found.

Possessive quantifiers are a way to prevent the regex engine from trying all permutations. This is primarily useful for performance reasons. You can also use possessive quantifiers to eliminate certain matches.

Of the regex flavors discussed in this tutorial, possessive quantifiers are supported by JGsoft, Java, and PCRE. That includes languages with regex support based on PCRE such as PHP, Delphi, and R. Ruby supports possessive quantifiers starting with Ruby 1.9, Perl supports them starting with Perl 5.10, and Boost starting with Boost 1.42.

How Possessive Quantifiers Work

Like a greedy quantifier, a possessive quantifier repeats the token as many times as possible. Unlike a greedy quantifier, it does not give up matches as the engine backtracks. With a possessive quantifier, the deal is all or nothing. You can make a quantifier possessive by placing an extra + after it. * is greedy, *? is lazy, and *+ is possessive. ++, ?+ and {n,m}+ are all possessive as well.

Let's see what happens if we try to match "[^"]*+" against "abc". The " matches the ". [^"] matches a, b and c as it is repeated by the star. The final " then matches the final " and we found an overall match. In this case, the end result is the same, whether we use a greedy or possessive quantifier. There is a slight performance increase though, because the possessive quantifier doesn't have to remember any backtracking positions.

The performance increase can be significant in situations where the regex fails. If the subject is "abc (no closing quote), the above matching process happens in the same way, except that the second " fails. When using a possessive quantifier, there are no steps to backtrack to. The regular expression does not have any alternation or non-possessive quantifiers that can give up part of their match to try a different permutation of the regular expression. So the match attempt fails immediately when the second " fails.

Had we used "[^"]*" with a greedy quantifier instead, the engine would have backtracked. After the " failed at the end of the string, the [^"]* would give up one match, leaving it with ab. The " would then fail to match c. [^"]* backtracks to just a, and " fails to match b. Finally, [^"]* backtracks to match zero characters, and " fails a. Only at this point have all backtracking positions been exhausted, and does the engine give up the match attempt. Essentially, this regex performs as many needless steps as there are characters following the unmatched opening quote.

When Possessive Quantifiers Matter

The main practical benefit of possessive quantifiers is to speed up your regular expression. In particular, possessive quantifiers allow your regex to fail faster. In the above example, when the closing quote fails to match, we know the regular expression couldn't possibly have skipped over a quote. So there's no need to backtrack and check for the quote. We make the regex engine aware of this by making the quantifier possessive. In fact, some engines, including the JGsoft engine, detect that [^"]* and " are mutually exclusive when compiling your regular expression, and automatically make the star possessive.

Now, linear backtracking like a regex with a single quantifier does is pretty fast. It's unlikely you'll notice the speed difference. However, when you're nesting quantifiers, a possessive quantifier may save your day. Nesting quantifiers means that you have one or more repeated tokens inside a group, and the group is also repeated. That's when catastrophic backtracking often rears its ugly head. In such cases, you'll depend on possessive quantifiers and/or atomic grouping to save the day.

Possessive Quantifiers Can Change The Match Result

Using possessive quantifiers can change the result of a match attempt. Since no backtracking is done, and matches that would require a greedy quantifier to backtrack will not be found with a possessive quantifier. For example, ".*" matches "abc" in "abc"x, but ".*+" does not match this string at all.

In both regular expressions, the first " matches the first " in the string. The repeated dot then matches the remainder of the string abc"x. The second " then fails to match at the end of the string.

Now, the paths of the two regular expressions diverge. The possessive dot-star wants it all. No backtracking is done. Since the " failed, there are no permutations left to try, and the overall match attempt fails. The greedy dot-star, while initially grabbing everything, is willing to give back. It will backtrack one character at a time. Backtracking to abc", " fails to match x. Backtracking to abc, " matches ". An overall match "abc" is found.

Essentially, the lesson here is that when using possessive quantifiers, you need to make sure that whatever you're applying the possessive quantifier to should not be able to match what should follow it. The problem in the above example is that the dot also matches the closing quote. This prevents us from using a possessive quantifier. The negated character class in the previous section cannot match the closing quote, so we can make it possessive.

Using Atomic Grouping Instead of Possessive Quantifiers

Technically, possessive quantifiers are a notational convenience to place an atomic group around a single quantifier. All regex flavors that support possessive quantifiers also support atomic grouping. But not all regex flavors that support atomic grouping support possessive quantifiers. With those flavors, you can achieve the exact same results using an atomic group.

Basically, instead of X*+, write (?>X*). It is important to notice that both the quantified token X and the quantifier are inside the atomic group. Even if X is a group, you still need to put an extra atomic group around it to achieve the same effect. (?:a|b)*+ is equivalent to (?>(?:a|b)*) but not to (?>a|b)*. The latter is a valid regular expression, but it won't have the same effect when used as part of a larger regular expression.

To illustrate, (?:a|b)*+b and (?>(?:a|b)*)b both fail to match b. a|b matches the b. The star is satisfied, and the fact that it's possessive or the atomic group will cause the star to forget all its backtracking positions. The second b in the regex has nothing left to match, and the overall match attempt fails.

In the regex (?>a|b)*b, the atomic group forces the alternation to give up its backtracking positions. This means that if an a is matched, it won't come back to try b if the rest of the regex fails. Since the star is outside of the group, it is a normal, greedy star. When the second b fails, the greedy star backtracks to zero iterations. Then, the second b matches the b in the subject string.

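My first understanding was that this is a matter of efficiency rather than correctness, an optimization and nothing more. Looking again, that was not right, and I wonder whether my original experiment was flawed. With regex_match the example from the text cannot succeed at all, whatever kind of quantifier is used: regex_match has to cover the entire input, and nothing in the pattern can consume the trailing x. The greedy, lazy and possessive variants only differ under regex_search. Note also that . does match ", which is exactly why the possessive ".*+" fails on this string.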
int main()
{
        string str="\"abc\"x";
        regex ex("\".+\"");
        cmatch what;
        if (regex_match(str.c_str(), what, ex, match_partial))
        {
                cout << what[0] <<  endl;
        }
        return 0;
}
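The difference between matching and searching on the tutorial's sample string can be sketched as follows. std::regex is used so the snippet builds without Boost; as far as I know its ECMAScript grammar has no possessive quantifiers, so only the greedy form is shown. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Greedy pattern: an opening quote, anything, a closing quote.
static const std::regex quoted("\".*\"");

// regex_match succeeds only when the pattern covers the entire input.
bool wholeInputIsQuoted(const std::string& text)
{
    return std::regex_match(text, quoted);
}

// regex_search may match a substring; the greedy .* first grabs everything
// and then backtracks until the closing quote can match.
std::string findQuoted(const std::string& text)
{
    std::smatch what;
    if (!std::regex_search(text, what, quoted))
    {
        return "";
    }
    return what[0].str();
}
```

For the input "abc"x (with the quotes), the search finds "abc" while the whole-input match fails because of the trailing x.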
这里再次摘抄boost regex character class

Character Classes that are Always Supported

The following character class names are always supported by Boost.Regex:

Name      POSIX-standard name   Description
alnum     Yes                   Any alpha-numeric character.
alpha     Yes                   Any alphabetic character.
blank     Yes                   Any whitespace character that is not a line separator.
cntrl     Yes                   Any control character.
d         No                    Any decimal digit
digit     Yes                   Any decimal digit.
graph     Yes                   Any graphical character.
l         No                    Any lower case character.
lower     Yes                   Any lower case character.
print     Yes                   Any printable character.
punct     Yes                   Any punctuation character.
s         No                    Any whitespace character.
space     Yes                   Any whitespace character.
unicode   No                    Any extended character whose code point is above 255 in value.
u         No                    Any upper case character.
upper     Yes                   Any upper case character.
w         No                    Any word character (alphanumeric characters plus the underscore).
word      No                    Any word character (alphanumeric characters plus the underscore).
xdigit    Yes                   Any hexadecimal digit character.

Character classes that are supported by Unicode Regular Expressions

The following character classes are only supported by Unicode Regular Expressions: that is those that use the u32regex type.  The names used are the same as those from Chapter 4 of the Unicode standard.

Short Name Long Name
ASCII
Any
Assigned
C* Other
Cc Control
Cf Format
Cn Not Assigned
Co Private Use
Cs Surrogate
L* Letter
Ll Lowercase Letter
Lm Modifier Letter
Lo Other Letter
Lt Titlecase
Lu Uppercase Letter
M* Mark
Mc Spacing Combining Mark
Me Enclosing Mark
Mn Non-Spacing Mark
N* Number
Nd Decimal Digit Number
Nl Letter Number
No Other Number
P* Punctuation
Pc Connector Punctuation
Pd Dash Punctuation
Pe Close Punctuation
Pf Final Punctuation
Pi Initial Punctuation
Po Other Punctuation
Ps Open Punctuation
S* Symbol
Sc Currency Symbol
Sk Modifier Symbol
Sm Math Symbol
So Other Symbol
Z* Separator
Zl Line Separator
Zp Paragraph Separator
Zs Space Separator

九月十四日等待变化等待机会

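I downloaded quite a few YouTube videos, but some episodes come in different formats. I used a simple loop to convert them with ffmpeg and kept running into an error. Google finally led me to the cause: ffmpeg reads from standard input and swallows the file names meant for read, so "< /dev/null" has to be added:
ls *.mkv | while read line; do ffmpeg -y -i "$line" "${line%.mkv}.mp4" < /dev/null ; done
A more detailed explanation is here: while read line; do ffmpeg -i "$line" "${line%.mkv}.mp4" < /dev/null ; done < <(find . -name '*.mkv')
Here <(find . -name '*.mkv') is process substitution: it stands for a file holding the output of the command, and a real file name could be put in its place. The < in front of it is ordinary redirection, which feeds that file to the loop.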

九月二十四日等待变化等待机会

笔记是个好东西,我再一次忘记了怎么获取mp3音轨的办法,这个是我的笔记。再次总结一下:
  1. 首先安装必要的库:
    sudo aptitude install libx264-dev libsdl1.2-dev libv4l-dev libass-dev libbluray-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libopenjp2-7-dev libtheora-dev libxvidcore-dev
  2. 这里的configure很重要,不但需要mp3的库,我更想避免安装我自己编译的版本防止和官方版本冲突,所以,我使用了静态编译不需要安装:
    ./configure --enable-gpl --enable-version3 --enable-nonfree --enable-postproc  --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libxvid --disable-shared --enable-static --enable-libass --enable-libbluray --enable-libopenjpeg   --enable-libfreetype --enable-libfontconfig --enable-gmp --enable-librsvg --enable-libtesseract --enable-ffmpeg --enable-ffplay --enable-ffprobe
  3. 这个小循环:
    for var in `ls *.m4v`; do name=`echo ${var}|grep -Eoi "^.*[^.]{5}"`; ~/Downloads/ffmpeg-git/ffmpeg -i ./${var} -c:a:0 libmp3lame -b 48000 -vn -sn mp3/${name}.mp3 ; done

九月二十五日等待变化等待机会

我的笔记本装有amd.gpu,而屡屡遭遇死机,是彻底的desktop frozen,并不是内核的crash,应该是desktop或者显卡的问题,因为我学乖了,使用ssh是可以登录的,看到内核文档里有amd gpu recovery被disable了,所以,google说在启动命令行加上amdgpu.gpu_recovery=1。不过我觉得我的问题是这个。小插曲是dmesg的时间戳可以这样看dmesg -T。

十月八日等待变化等待机会

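Many days wasted. Today I finally did something useful. First, I realized that I had not collected all the MIME types that libmagic itself supports, so I used magic_list to dump the complete list first. Along the way I felt deeply how important my notes are; the regex notes here are very good! For example, in boost::regex ex("\\[(.*)\\]"); the .* keeps its special meaning because it is not inside a character class. Why? Because \\[ and \\] are escaped, so they are ordinary characters and do not open a class.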
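The contrast between an escaped bracket and a real character class can be sketched like this, again with std::regex rather than boost::regex so that it builds on its own; bracketContent and isDotsAndStars are hypothetical names.

```cpp
#include <regex>
#include <string>

// \[ and \] are literal brackets, so (.*) between them is a normal wildcard group.
// Returns the text between the brackets, or "" when the input has another shape.
std::string bracketContent(const std::string& text)
{
    static const std::regex ex("\\[(.*)\\]");
    std::smatch what;
    if (!std::regex_match(text, what, ex))
    {
        return "";
    }
    return what[1].str();
}

// Unescaped brackets open a character class, where '.' and '*' are plain characters.
bool isDotsAndStars(const std::string& text)
{
    static const std::regex ex("[.*]+");
    return std::regex_match(text, ex);
}
```

So bracketContent("[text/html]") yields the inner text, while the class version only accepts strings made of dots and stars.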
另一个我想尝试而没有时间的是资源文件嵌入可执行文件的问题,看上去虽然简单但是始终没有时间实验。对于这个做法我还是心存疑虑,主要的想法是我在想对于动态库能否使用data section呢?

extern uint8_t data[]     asm("_binary_foo_bar_start");
extern uint8_t data_end[] asm("_binary_foo_bar_end");

十月十九日等待变化等待机会

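While using youtube-dl I hit an error where the file name was too long to be written. For a long time I found no fix, and skipping the item with start-list did not help either; later --restrict-filenames seemed to work. I also misunderstood the --continue option. I ran into timeout errors and needed to retry, and someone found through Google recommended this option, which is badly misleading: it only controls whether partially downloaded files are resumed, and its opposite is --no-continue. That cost me dearly. What should be used is -R infinite, the number of retries; --fragment-retries infinite may also help.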

十月二十一日等待变化等待机会

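I googled the youtube-dl timeout problem and it seems to be fairly hard to solve. The retry workaround that others suggest looks like this (the URL has to be quoted, otherwise the shell treats the & as a background operator):
for i in {1..20}; do youtube-dl -l 'https://www.youtube.com/watch?v=pALJAT4pkIo&list=PLYG8vFcMYIaKWEC-uJ2DdHDToQjtksFxV' && break || sleep 15; done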

十月二十四日等待变化等待机会

这是谁写的代码?我对此敬仰不已啊,难道是我自己写的吗?我怎么一点印象都没有呢?我的智商与记忆与日俱减根本无法想象我以前怎么写的这些? 早上胡思乱想看到boost的parser还能这样使用让我惊讶不已。我的同事是一个很优秀的QA最近问我c++和c语言相比如何,我只有老老实实跟她说这几乎是两种不同的语言,基本上前者是后者的超集,一般来说使用c++的程序员一般能够看懂c程序,而反过来不行。而c++11的c11和c++98又是另一次飞跃,我承认我对于boost/c++11有很大的困惑,可以说是从头再学习。
//============================================================================
// Name        : htmlTest.cpp
// Author      : 
// Version     :
// Copyright   : Your copyright notice
// Description : Hello World in C++, Ansi-style
//============================================================================

#include <iostream>
#include <fstream>
#include <boost/regex.hpp>
using namespace std;
using namespace boost;
int main()
{
	ifstream in("/BigDisk/diabloforum/public_html/2019.htm");
	ofstream out("/tmp/2019.htm");
	if (in.is_open() && out.is_open())
	{
		regex ex("<hr><p  style=\"color:red;\">([\\W]+月[\\W]+日)等待变化等待机会</p>");
		cmatch what;
		string str;
		char buffer[20480];
		while (in.getline(buffer, 20480))
		{
			if (regex_match(buffer, what, ex))
			{
				cout << "****************************"<<buffer <<"**************************" << endl;
				string strDate = string(what[1].first, what[1].second);
				cout << strDate << endl;
				string strNew = "<a id=\"";
				strNew += strDate;
				strNew += "\">";
				strNew+= buffer;
				strNew+="</a>";
				cout << strNew << endl;
				out << strNew << endl;
			}
			else
			{
				out << buffer << endl;
			}
		}
	}
	return 0;
}
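This is hello-world at the level of a baby learning to talk. I only wanted to solve a simple problem: tag my diary so that every entry can be referenced by its date. An HTML parser would be overkill, and I had none at hand anyway. So what went wrong? I had always assumed that Chinese text requires the wide-character wregex together with wide L"blabla" expressions, and wcmatch to go with them. Is there a wide-character version of regex_match? I was puzzled. There is not, but the input string then has to be a wstring, and so the stream has to be a wifstream. Is there a wide-character getline? Apparently not. Yet the output needs wcout, which left me speechless. Then my getline failed, and I realized that since my file is encoded in UTF-8 the whole wide-character idea was a false premise, so all of the above is wrong. How then should the expression cope with Chinese? How many chars is the character "一", for instance? In the end, as you can see, I took the lazy way: never mind how many, I match them all with [\\W]+. This toddler-level problem took me an hour or two.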
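To answer the "how many chars" question concretely: in UTF-8 the character 一 (U+4E00) occupies three bytes, E4 B8 80, which is why a byte-oriented regex sees it as several non-word chars. A minimal sketch that counts characters instead of bytes, assuming the input is valid UTF-8; countCodePoints is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string (no validation is done).
std::size_t countCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
    {
        // Continuation bytes look like 10xxxxxx; every other byte starts a character.
        if ((byte & 0xC0) != 0x80)
        {
            ++count;
        }
    }
    return count;
}
```

std::string::size() on the same string still reports bytes, so "\xE4\xB8\x80" has size 3 but one code point.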

十月三十一日等待变化等待机会

关于youtube-dl下载中遇到的“Requested formats are incompatible for merge and will be merged into mkv”是很容易预想到的。我打算尝试这样一个套路:
for i in {1..100}; do youtube-dl --force-ipv4 -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]' -i -l {PlayListUrl} && break || sleep 20; done
终于使用tidy检查目前的日记到没有错误和警告
./tidy --warn-proprietary-attributes no --fix-uri no -utf8 --indent-with-tabs no -f /tmp/error.txt /BigDisk/diabloforum/public_html/2019.htm
其中--fix-uri设定为no是因为我使用中文做uri,这个似乎是在html5里允许的,但是tidy要兼容以前的要求所以只能使用选项

十一月四日等待变化等待机会

对于boost::regex识别中文的盲目试验终于看到了一点点的曙光,肯定是普通的regex不能很好的识别中文,哪怕是utf8的中文,依然的他们不属于任何的.*,在寻求google无望之后唯一的光明就是大师的RTFC,下载boost源代码从看sample code学起。

十一月九日等待变化等待机会

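That is how my memory works now: I had completely forgotten the earlier regex problem, and without the previous entry I would not recall it at all. Yesterday at work I experimented with Intel's Optane persistent memory. The one thing worth noting: a char device can be opened like a file, but its size cannot be obtained with an ordinary stat, because it is not a regular file. So how do you use mmap on it? I found the answer in the code of a fio test referenced by ndctl: the operating system knows the size of a devdax device. All devices appear under /sys/devices, and some of them have a size attribute (for block devices such as the partition below, the value counts 512-byte sectors rather than bytes):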
nick@nick-HP-Laptop-AMD:/sys/devices$ cat ./pci0000:00/0000:00:01.5/0000:04:00.0/nvme/nvme0/nvme0n1/nvme0n1p5/size | numfmt --to=iec
746M
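Reading such a size attribute from a program is just a matter of parsing one number from a small text file. A minimal sketch; the function name readSizeAttribute is hypothetical, and which sysfs path holds the size of a given devdax device (for example something under /sys/dev/char/<major>:<minor>/) is an assumption that should be checked on the actual system.

```cpp
#include <fstream>
#include <string>

// Reads a single unsigned number from a sysfs-style attribute file.
// Returns 0 when the file cannot be opened or does not start with a number.
unsigned long long readSizeAttribute(const std::string& path)
{
    std::ifstream in(path);
    unsigned long long value = 0;
    if (!(in >> value))
    {
        return 0;
    }
    return value;
}
```

The returned value can then be passed as the length argument of mmap, after converting units where the attribute is given in sectors.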
对于fsdax/raw/sector的模式他们都是文件所以你可以创建文件
[root@centos76-nick run]# dd if=/dev/zero of=bigfile.bin bs=1M count=100K 
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 36.6604 s, 2.9 GB/s
[root@centos76-nick run]# 
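In my tests mmap performance differs little between file systems: across tmpfs, devdax and fsdax it drops by less than 10% from one to the next. So I cannot imagine how Intel achieves its claimed reduction of in-memory database startup from 37 minutes to a few dozen seconds. Perhaps my test program did not include the msync time. Today I added msync and nothing seemed to change.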

十一月十三日等待变化等待机会

我的笔记本的amdgpu导致的死机问题让我抓狂。这里先把hp的官方的关于硬件信息记录如下
这里我准备尝试更新驱动,也许要更糟糕?
  1. 首先添加repo并安装:
    $ sudo add-apt-repository ppa:oibaf/graphics-drivers
    $ sudo apt-get update
    $ sudo apt update && sudo apt -y upgrade
  2. Enable DRI3: 在/etc/X11/xorg.conf添加如下
    
    Section "Device"
        Identifier "AMDGPU"
        Driver "amdgpu"
        Option "AccelMethod" "glamor"
        Option "DRI" "3"
    EndSection
这么做会怎样呢?彻底的失败,这个驱动会导致启动死机,我使用nomodeset可以不死机但是显示会很慢,显示加速必需要开.为了去掉之前的ppa我安装了ppa-purge,我是怎么知道的呢?查看/var/log/dpkg.log另一个问题是我的firefox总是不能正确运行,于是我把.mozilla删除了才解决了crash的问题.然后gedit启动后总是类似透明的问题:/usr/share/themes/Ambiance/gtk-3.20/gtk.css设定为
textview text {
   background-color: white;
}
scrollbar {
   background-color: white;
}

@import url("gtk-main.css");
因为gedit非常的慢我只好安装leafpad,这个非常的快.也许就是不同的desktop的原件交叉使用吧?

十一月十八日等待变化等待机会

对于当前计算机架构的存储分级结构最清晰的金字塔图标,不仅仅是速度的概念,更重要的是其中的单位价格信息更加的有用。

[Figure: pyramid diagram of the storage hierarchy]

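I slightly changed the earlier test, mainly by adding msync; its address argument must be the address returned by the current mmap call. The test data below is very impressive; the run times are in seconds. The choice of buffer size also matters a great deal: note that the Linux page cache page size is 4 KiB, so whether the buffer size is a multiple of it directly determines much of the efficiency.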

type                     Write buffer 1 GiB   Write buffer 4 MiB   Write buffer 4 KiB
HDD                      71                   140                  30+ minutes!!!
fsdax                    21                   16                   20
dax (char dev of DPMM)   5                    13                   15

这里有很好的关于文件cache的文章
我对于google拼音非常的有意见,不知是有意无意它里面的某些台独工程师故意把默认的中文绑定为繁体中文,这种小人伎俩非常让人齿冷。我只能使用ibus因为fcix似乎是绑定了。

十一月十九日等待变化等待机会

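There is a simple piece of regex common sense that I had misunderstood: . acts as a wildcard only when it is outside a character class. So [.]+ treats . as a literal and means one or more dots. That is the shallow part. The subtle part is .+?, where the ? makes the repeat lazy, as opposed to the default greedy repeat (possessive is yet another thing). The difference is huge: without the lazy ?, the whole file yields a single match running from the first <p style=\"margin-bottom: 0in\"> to the last </p>, instead of matching the paragraphs one by one.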

#include <iostream>
#include <fstream>
#include <boost/regex.hpp>
#include <boost/filesystem.hpp>
using namespace std;
using namespace boost;

int main()
{
	string str;
	filesystem::load_string_file(filesystem::path("/BigDisk/diabloforum/public_html/2017.htm"), str);
	ofstream out("/tmp/2017.htm");
	string::const_iterator start = str.begin(), end = str.end();

	regex ex("(<p style=\"margin-bottom: 0in\">).+?(</p>)");
	match_results<string::const_iterator> what;

	while (regex_search(start, end, what, ex))
	{
		// copy the text in front of the match unchanged
		if (start != what[0].first)
		{
			out << string(start, what[0].first);
		}
		// keep only what lies between the opening and the closing tag
		string strRemain = string(what[1].second, what[2].first);
		out << strRemain;
		start = what[0].second;
	}
	// copy whatever follows the last match
	if (start != end)
	{
		out << string(start, end) << endl;
	}
	return 0;
}
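The same unwrapping can be sketched in one call with std::regex_replace. Two assumptions are worth stating: std::regex is used in place of boost::regex, and since its . does not match a newline, [\s\S] stands in for "any character". The function name stripParagraphTags is hypothetical.

```cpp
#include <regex>
#include <string>

// Removes the <p style="margin-bottom: 0in"> ... </p> wrappers and keeps their content.
std::string stripParagraphTags(const std::string& html)
{
    // The lazy +? stops at the first </p>, so each paragraph is handled separately.
    static const std::regex ex("<p style=\"margin-bottom: 0in\">([\\s\\S]+?)</p>");
    return std::regex_replace(html, ex, "$1");
}
```

With a greedy + instead, two paragraphs in one input would be treated as a single match and the tags between them would survive.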
这里是摘抄一篇关于linux page cache的文章。而其中需要指出的是关于linux内存管理的深度介绍才是更有价值的网站。

1 前言

自从诞生以来,Linux 就被不断完善和普及,目前它已经成为主流通用操作系统之一,使用得非常广泛,它与 Windows、UNIX 一起占据了操作系统领域几乎所有的市场份额。特别是在高性能计算领域,Linux 已经成为一个占主导地位的操作系统,在2005年6月全球TOP500 计算机中,有 301 台部署的是 Linux 操作系统。因此,研究和使用 Linux 已经成为开发者的不可回避的问题了。

下面我们介绍一下 Linux 内核中文件 Cache 管理的机制。本文以 2.6 系列内核为基准,主要讲述工作原理、数据结构和算法,不涉及具体代码。

2 操作系统和文件 Cache 管理

操 作系统是计算机上最重要的系统软件,它负责管理各种物理资源,并向应用程序提供各种抽象接口以便其使用这些物理资源。从应用程序的角度看,操作系统提供了 一个统一的虚拟机,在该虚拟机中没有各种机器的具体细节,只有进程、文件、地址空间以及进程间通信等逻辑概念。这种抽象虚拟机使得应用程序的开发变得相对 容易:开发者只需与虚拟机中的各种逻辑对象交互,而不需要了解各种机器的具体细节。此外,这些抽象的逻辑对象使得操作系统能够很容易隔离并保护各个应用程 序。

对于存储设备上的数据,操作系统向应用程序提供的逻辑概念就是"文件"。应用程序要存储或访问数据时,只需读或者写"文件"的一维地址空间即可, 而这个地址空间与存储设备上存储块之间的对应关系则由操作系统维护。

在Linux操作系统中,当应用程序需要读取文件中的数据时,操作系统先分配一些内存,将数据从存储设备读入到这些内存中,然后再将数据分发给应用程序;当需要往文件 中写数据时,操作系统先分配内存接收用户数据,然后再将数据从内存写到磁盘上。文件 Cache 管理指的就是对这些由操作系统分配,并用来存储文件数据的内存的管理。 Cache 管理的优劣通过两个指标衡量:一是 Cache 命中率,Cache 命中时数据可以直接从内存中获取,不再需要访问低速外设,因而可以显著提高性能;二是有效 Cache 的比率,有效 Cache 是指真正会被访问到的 Cache 项,如果有效 Cache 的比率偏低,则相当部分磁盘带宽会被浪费到读取无用 Cache 上,而且无用 Cache 会间接导致系统内存紧张,最后可能会严重影响性能。

下面分别介绍文件 Cache 管理在 Linux 操作系统中的地位和作用、Linux 中文件 Cache相关的数据结构、Linux 中文件 Cache 的预读和替换、 Linux 中文件 Cache 相关 API 及其实现。

2 文件 Cache 的地位和作用

文件 Cache 是文件数据在内存中的副本,因此文件 Cache 管理与内存管理系统和文件系统都相关:一方面文件 Cache 作为物理内存的一部分,需要参与物理内存的分配回收过程,另一方面文件 Cache 中的数据来源于存储设备上的文件,需要通过文件系统与存储设备进行读写交互。从操作系统的角度考虑,文件 Cache 可以看做是内存管理系统与文件系统之间的联系纽带。因此,文件 Cache 管理是操作系统的一个重要组成部分,它的性能直接影响着文件系统和内存管理系统的性能。

图1描述了 Linux 操作系统中文件Cache 管理与内存管理以及文件系统的关系示意图。从图中可以看到,在 Linux 中,具体文件系统,如 ext2/ext3、jfs、ntfs 等,负责在文件 Cache和存储设备之间交换数据,位于具体文件系统之上的虚拟文件系统VFS负责在应用程序和文件 Cache 之间通过 read/write 等接口交换数据,而内存管理系统负责文件 Cache 的分配和回收,同时虚拟内存管理系统(VMM)则允许应用程序和文件 Cache 之间通过 memory map的方式交换数据。可见,在 Linux 系统中,文件 Cache 是内存管理系统、文件系统以及应用程序之间的一个联系枢纽。

3 文件 Cache 相关数据结构

在Linux 的实现中,文件 Cache 分为两个层面,一是 Page Cache,另一个 Buffer Cache,每一个 Page Cache 包含若干 Buffer Cache。内存管理系统和 VFS 只与 Page Cache 交互,内存管理系统负责维护每项 Page Cache 的分配和回收,同时在使用 memory map 方式访问时负责建立映射;VFS 负责 Page Cache 与用户空间的数据交换。而具体文件系统则一般只与 Buffer Cache 交互,它们负责在外围存储设备和 Buffer Cache 之间交换数据。Page Cache、Buffer Cache、文件以及磁盘之间的关系如图 2 所示,Page 结构和 buffer_head 数据结构的关系如图 3 所示。在上述两个图中,假定了 Page 的大小是 4K,磁盘块的大小是 1K。本文所讲述的,主要是指对 Page Cache 的管理。

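In the Linux kernel, each data block of a file corresponds to at most one Page Cache entry. The kernel manages these entries with two data structures: a radix tree and doubly linked lists. A radix tree is a search tree that the kernel uses to locate a cache entry quickly from an offset within the file. Figure 4 sketches a radix tree with a fan-out of 4 (2^2) and a height of 4, which locates an 8-bit file offset. In the Linux (2.6.7) kernel the fan-out is 64 (2^6) and the height is 6 (32-bit systems) or 11 (64-bit systems), which locates 32-bit or 64-bit offsets; each leaf node of the radix tree points to the cache entry for the corresponding file offset.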

另一个数据结构是双向链表,Linux内核为每一片物理内存区域(zone)维护active_list和inactive_list两个双向链表,这两个 list主要用来实现物理内存的回收。这两个链表上除了文件Cache之外,还包括其它匿名(Anonymous)内存,如进程堆栈等。

4 文件Cache的预读和替换

Linux内核中文件预读算法的具体过程是这样的:对于每个文件的第一个读请求,系统读入所请求的页面并读入紧随其后的少数几个页面(不少于一个页面,通常是三个页 面),这时的预读称为同步预读。对于第二次读请求,如果所读页面不在Cache中,即不在前次预读的group中,则表明文件访问不是顺序访问,系统继续 采用同步预读;如果所读页面在Cache中,则表明前次预读命中,操作系统把预读group扩大一倍,并让底层文件系统读入group中剩下尚不在 Cache中的文件数据块,这时的预读称为异步预读。无论第二次读请求是否命中,系统都要更新当前预读group的大小。此外,系统中定义了一个 window,它包括前一次预读的group和本次预读的group。任何接下来的读请求都会处于两种情况之一:第一种情况是所请求的页面处于预读 window中,这时继续进行异步预读并更新相应的window和group;第二种情况是所请求的页面处于预读window之外,这时系统就要进行同步 预读并重置相应的window和group。图5是Linux内核预读机制的一个示意图,其中a是某次读操作之前的情况,b是读操作所请求页面不在 window中的情况,而c是读操作所请求页面在window中的情况。

Linux内核中文件Cache替换的具体过程是这样的:刚刚分配 的Cache项链入到inactive_list头部,并将其状态设置为active,当内存不够需要回收Cache时,系统首先从尾部开始反向扫描 active_list并将状态不是referenced的项链入到inactive_list的头部,然后系统反向扫描inactive_list,如 果所扫描的项的处于合适的状态就回收该项,直到回收了足够数目的Cache项。Cache替换算法如图6的算法描述伪码所示。

图6 Linux的Cache替换算法描述
Mark_Accessed(b) {
    if b.state == (UNACTIVE | UNREFERENCE)
        b.state = (UNACTIVE | REFERENCE)
    else if b.state == (UNACTIVE | REFERENCE) {
        b.state = (ACTIVE | UNREFERENCE)
        Move b to tail of active_list
    } else if b.state == (ACTIVE | UNREFERENCE)
        b.state = (ACTIVE | REFERENCE)
}
Reclaim() {
    scan_num = 0
    while active_list not empty and scan_num < MAX_SCAN1 {
        X = head of active_list
        if (X.state & REFERENCE) == 0
            Move X to tail of inactive_list
        else {
            X.state &= ~REFERENCE
            Move X to tail of active_list
        }
        scan_num++
    }
    scan_num = 0
    while inactive_list not empty and scan_num < MAX_SCAN2 {
        X = head of inactive_list
        if (X.state & REFERENCE) == 0
            return X
        else {
            X.state = ACTIVE | UNREFERENCE
            Move X to tail of active_list
        }
        scan_num++
    }
    return NULL
}
Access(b) {
    if b is not in cache {
        if slot X free
            put b into X
        else {
            X = Reclaim()
            put b into X
        }
        Add X to tail of inactive_list
    }
    Mark_Accessed(X)
}
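
To make the pseudocode above easier to experiment with, here is a small runnable rendering of it. It is a sketch under assumptions, not the kernel's implementation: the names Page and TwoListCache, the single max_scan limit used for both lists, and the retry of the reclaim pass when the first pass only clears reference bits are all choices made for this example.

```python
from collections import deque


class Page:
    """One cache entry with the two state bits used in the pseudocode."""

    def __init__(self, key):
        self.key = key
        self.active = False
        self.referenced = False


class TwoListCache:
    def __init__(self, capacity, max_scan=16):
        self.capacity = capacity
        self.max_scan = max_scan   # assumed: same scan limit for both lists
        self.pages = {}            # key -> Page
        self.active = deque()      # head is the left end, tail the right end
        self.inactive = deque()

    def _mark_accessed(self, p):
        if not p.active and not p.referenced:
            p.referenced = True
        elif not p.active and p.referenced:
            # Second touch while inactive: promote to the active list.
            self.inactive.remove(p)
            p.active, p.referenced = True, False
            self.active.append(p)
        elif p.active and not p.referenced:
            p.referenced = True

    def _reclaim(self):
        # Pass 1: age the active list, demoting unreferenced pages.
        for _ in range(min(self.max_scan, len(self.active))):
            p = self.active.popleft()
            if p.referenced:
                p.referenced = False
                self.active.append(p)
            else:
                p.active = False
                self.inactive.append(p)
        # Pass 2: evict the first unreferenced page on the inactive list.
        for _ in range(min(self.max_scan, len(self.inactive))):
            p = self.inactive.popleft()
            if not p.referenced:
                del self.pages[p.key]
                return p
            p.active, p.referenced = True, False
            self.active.append(p)
        return None

    def access(self, key):
        p = self.pages.get(key)
        if p is None:
            if len(self.pages) >= self.capacity:
                # Assumption: retry once, because the first pass may only
                # have cleared reference bits without freeing anything.
                if self._reclaim() is None and self._reclaim() is None:
                    raise RuntimeError("no reclaimable page within scan limit")
            p = Page(key)
            self.pages[key] = p
            self.inactive.append(p)
        self._mark_accessed(p)
        return p
```

Accessing the keys a, a, b, c with capacity 2 shows the aging at work: a is promoted by its second access, but the reclaim triggered by c first demotes the unreferenced a and then evicts it, while the still-referenced b is promoted instead.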

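5 File Cache APIs and Their Implementation

The Linux kernel offers many APIs related to file Cache operations. By usage they fall into two classes. The first class operates by copying, such as read/write/sendfile (the source article says sendfile is no longer supported in the 2.6 kernel series, but the sendfile system call is still available in 2.6 and later kernels). The second class operates by address mapping, such as mmap.

APIs of the first class copy data between the Caches of different files, or between the Cache and a user-space buffer supplied by the application. Figure 7 shows how this works.

APIs of the second class map Cache entries into user space so that the application can access a file the way it uses a memory pointer. Memory-mapped access to the Cache is implemented in the kernel with demand paging; Figure 8 shows how it works.

First, the application calls mmap (1 in the figure); after trapping into the kernel, do_mmap_pgoff is called (2). This function allocates a region in the application's address space as the mapped memory address and represents it with a VMA (vm_area_struct) structure, then returns to the application (3). When the application accesses the address returned by mmap (4), a page fault is triggered (5) because the virtual-to-physical mapping has not been established yet. The system then calls the page fault handler (6). In the handler the kernel determines from the region's VMA structure that the region is a file mapping, calls the specific file system's interface to read in the corresponding Page Cache entries (7, 8, 9), and fills in the corresponding page table entries. After these steps the application can access the memory region normally.

6 Summary

File Cache management is an important part of the Linux operating system and also an active research area. At present, work on the Linux kernel in this area focuses on developing more effective Cache replacement algorithms, such as LIRS (and its variant ClockPro) and ARC. More information is available at http://linux-mm.org/AdvancedPageReplacement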
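
From the application's side, the mapping path described above can be tried with a few lines of user-space code. The sketch below uses Python's standard mmap module; the helper name read_via_mmap and the temporary demo file are made up for the example. The page faults and Page Cache fills happen inside the kernel and are not visible in the code.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes at `offset` by mapping the file instead of read()."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; touching the mapping is what
        # triggers the page faults that pull data in through the Page Cache.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


# Small demo on a temporary file (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello page cache")
    demo_path = tmp.name
try:
    data = read_via_mmap(demo_path, 6, 4)
finally:
    os.unlink(demo_path)
```

The slice copies the bytes out of the mapping, so the result stays valid after the mapping is closed.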


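November 21: Waiting for change, waiting for opportunity

This is the most complete HOMM3 resource library I have seen so far, and this map site is the largest HOMM map site I have seen. The site's protection against automated downloads is so aggressive that even my manual downloads are often blocked; the webmaster deserves to make money. I really love this list, so I kept a copy. Likewise I saved a copy of this spell list. Here is the hero list, of which I also kept a copy.
Here I found a rather novel paper and kept a copy. As for text editors, I now fully recommend geany; this tool, originally meant for programming, is more than adequate for writing documents.
Bundling resource files with a Linux executable is not really a hard problem. I already knew one approach, using objcopy, but now that I have seen the approach of using an assembly file directly, I think it is better. First, the code can refer to arbitrary symbols instead of doing an awkward search by section name, which I never liked because it is as crude as the approach used in http-post. Second, I can freely use multiple resource files and multiple symbols, which is very flexible. This post is well worth reading: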

    .global blob
    .global blob_size
    .section .rodata
blob:
    .incbin "blob.bin"
1:
blob_size:
    .int 1b - blob
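This is the code of the assembly file blob.S. Although I consider my assembly skills shallow, I understand the basics: .incbin works like include, and 1: is a local label that records the address right after the included data, so 1b - blob is the size of the blob. So this is easy to follow. I made a demo following the same pattern.
Converting .webm to .mp4 requires regenerating the pts (genpts is an input option, so it goes before -i): ffmpeg -fflags +genpts -i input.webm -r 60 input.mp4
I found another paper about HOMM3. This classic game evidently remains hugely popular and is studied by countless people; I saved a local copy. This article is related to the previous one, and its matrix of the probabilities of each hero learning each skill is excellent!
For the damage calculation formula of HOMM3, this is a good source.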

==Damage Calculation==
=== Heroes III ===
The total damage inflicted in an attack is calculated after the following formula:

Total Damage=Base Damage*(1+a+b+c+...)*(1-q)*(1-r)*(1-s)*...
, where:

*Base Damage is the damage displayed for every unit in its stats (e.g. 40-50 for a Black dragon). For a stack of creatures it is calculated as such:
**If there are fewer than 10 creatures in a stack, then a random integer is chosen from the damage range for each creature, and these numbers are added up.
**If there are more than 10 creatures in a stack, 10 random integers are chosen from the damage range of the creature type and are added up. The result is multiplied by n/10, where n is the number of creatures in the stack, and rounded down.
The probability that the resulting base damage will be in the lowest/highest 25% is (approximately):
*a,b,c,... are damage bonuses
*q,r,s,... are damage reductions (both as decimal numbers)

====Damage Bonuses====
Situation | Bonus
Attacker's Attack Skill > Defender's Defense Skill | 0.05 * (A - D), (capped at 3)
Attacker is a shooter | Archery Skill bonus, additive with artifact bonuses if Archery skill present
Attacker is a Ballista, does double damage | 1
Attacking hero has Archery specialty | 0.05 * Hero level * Archery bonus
Attacking hero has Offense Skill | Offense skill bonus
Attacking hero has Offense specialty | 0.05 * Hero level * Offense bonus
Attacker gets Luck | 1
Attacker is a Cavalier/Champion | 0.05 * hexes traveled
Attacker is an opposite Elemental type | 1
Attacker 'hates' the Defender | 0.5
Dread knight's double damage | 1
Bless specialty Hero, Bless is cast | 0.03 * Hero level / Unit level
====Damage Reductions====
Situation | Reduction
Defender's Defense > Attacker's Attack | 0.025 * (D - A), up to a maximum of 0.7
Defending hero has Armorer | Armorer skill bonus
Defending hero has specialty Armorer | 0.05 * Hero level * Armorer bonus, additive with base Armorer bonus
Defender has Shield spell applied | Shield spell bonus
Attacker is a shooter, Basic Air Shield is cast | 0.25
Attacker is a shooter with range, wall or melee penalty, or Advanced Air Shield is cast | 0.5
Attacking a petrified unit | 0.5
Attacker is Psychic Elemental, defender is immune to Mind spells | 0.5
Attacker is Magic Elemental, defender is Magic Elemental or the Black Dragon | 0.5
Unit retaliates from Basic Blind state | 0.5
Unit retaliates from Advanced Blind state | 0.25
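
The formula and the two Attack/Defense rows above are easy to get wrong by hand, so here is a minimal sketch of them in code. The function names are made up for illustration; the text does not say how a stack of exactly 10 is handled (both branches give the same distribution there) or how the final total is rounded, so the sketch treats 10 like the small-stack case and leaves the total as a float.

```python
import random


def base_damage(min_dmg, max_dmg, count, rng=random):
    """Base damage of a stack: per-creature rolls, or 10 rolls scaled by n/10."""
    if count <= 10:
        return sum(rng.randint(min_dmg, max_dmg) for _ in range(count))
    total = sum(rng.randint(min_dmg, max_dmg) for _ in range(10))
    return total * count // 10  # multiplied by n/10 and rounded down


def attack_skill_bonus(attack, defense):
    """0.05 per point of Attack above Defense, capped at 3."""
    return min(0.05 * (attack - defense), 3.0) if attack > defense else 0.0


def defense_skill_reduction(attack, defense):
    """0.025 per point of Defense above Attack, capped at 0.7."""
    return min(0.025 * (defense - attack), 0.7) if defense > attack else 0.0


def total_damage(base, bonuses=(), reductions=()):
    """Base * (1 + a + b + ...) * (1 - q) * (1 - r) * ..."""
    damage = base * (1 + sum(bonuses))
    for r in reductions:
        damage *= 1 - r
    return damage
```

Note that bonuses are added together before multiplying, while each reduction multiplies separately, so two 0.5 reductions leave a quarter of the damage rather than none.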

November 22: Waiting for change, waiting for opportunity

I found a professional strategy analysis article (HOMM3) and saved it to study later.

Coyot's HOMM3 Guide — Strategy

A guide that contains all general tricks and rules that apply to all classes.
  1. Exploring your vicinity

    In most cases it is very wise to purchase at least one more hero in addition to your starting one. Of course, if you have an observation tower in front of your castle and can see from there that all ways out are blocked by some stronger armies and you only have a little space to explore, a second hero won't be needed... yet.

    1. Main advantages of two-or-more-heroes exploration:
      1. In early stage - that 'round the castle' exploration - you will usually profit from having two heroes by having more goodies gathered quicker. Applies to mines, resources, artifacts and even the fact of uncovering the map. [More heroes can be used, if your financial situation seems good and exploration is possible. These guys usually pay back quite soon.]
      2. Further in the game, when you assemble a task force strong enough to go after your enemy, a second guy in backup can be used for several things:
        • takes care of resources, mines and full exploration of conquered land (here I often purchase new heroes at conquered castles, because within a few days the money is back in resources from taken mines.)
        • holds backup armies for your main hero
        • can build up quite solid experience
        • can take out some of the wandering armies
        • can take care of some minor enemy heroes/towns
    2. How to select your primary/secondary hero?
      With the exception of necromancer, there's no need at all to use a hero corresponding with the castle class (and even with the necro castle, you should consider buying another type if there aren't any large quantities of skeleton reserves (aka 1st-level troops) scattered around the map). However, a hero aligned to your starting castle will bring in troops that could be of use! Sometimes (when a ranger comes with two stacks of dwarves or a necro with walking dead) you may ignore this benefit. So, how to make your choice?
      1. Primary hero/castle: Since both might and magic heroes exist in each castle and every hero is unique, choosing primary castle and primary hero is quite complicated. First, for the castles:
        • Castle: Balanced troops, rather slow, more emphasis on the top troops.
        • Rampart: Rather fast troops, good specials, best 1st-level troop.
        • Tower: Good specials, strong 7th-level troop, numerous 1st-level.
        • Stronghold: Offensive troops, cheap 7th-level troop. Good 5th-level.
        • Fortress: Defensive troops, nasty specials - dangerous 7th-level troop.
        • Dungeon: Rather offensive troops, best 2nd-level troop. Black dragons ;-)
        • Inferno: Offensive troops with two non-retaliated attacks and good specials.
        • Necropolis: Good specials, cover of darkness structure.

        Second, for the heroes:
        I prefer might heroes to the magic ones, because in most cases the might ones still get some experience in magic. Might heroes can develop magic schools, and since bigger armies benefit more from misc. spells than from damage spells, with the only exception being Armageddon ;-), might heroes should be better in the long run, when they can get enough spellpower and knowledge to be able to cast Bless, Slow or Haste every now and then.
        Necromancer may be worth it if large amounts of skeleton producing wandering armies are located nearby - the recruited skeletons make excellent garrison troops.
        My usual practice is to hire a might hero (unless I have access to some high-level magic and/or magic skills boosting structures and artifacts) and concentrate the experience with him. In many cases I manage to get Wisdom, so I can use some good spells from captured castles. In fact, probability of not getting wisdom for might hero is next to zero.
        There are only two situations where a strong magic hero is better: Both of them are based on Fire spells. One of them is Armageddon, but that requires the magic hero to also have some spell-resistant unit. The other one is Berzerker at Expert level. With it, up to 5 armies are often forced to fight each other and the only solution will be mass Cure or Dispel.
        Direct damage spells (maybe except Armageddon) aren't a big threat, because with large armies the damage caused by those can be compensated by supportive spells on your troops - with big enough armies, Mass Bless beats Chain Lightning. (Of course, you need an expert magic school for that...)

        Getting secondary skills:
        Check the "Way of" section for your particular class to see which skills best suit your troops and strategies.
        There are some general rules to follow. See description of advanced skills below for most of them. Many of the skills are just generally good (Luck, Leadership) while others are case-specific (Mysticism, Wisdom, Logistics, Ballistics) where the conditions cover anything from Map size, number of castles, type of main hero, type of most used troops, amounts of resources available and all other things up to terrain types.
        Of course, the hero class is very important. I.e. Leadership, Offense and Archery provide much better advantage to Might Heroes, whose troop damage ratings (that profit from their skills) are generally high. On the other hand, for a spell caster skills like Expert Wisdom are a must, because those expand his strong features.

      2. Secondary hero:
        [In the following, percentages for level advancement are averaged for heroes with 20 levels! The advancement in first 9 levels is a bit different, generally with more emphasis towards the skills native to that hero. So that barbarians get 55% levels on attack below level 10 and only 30% later... yet I used the averaged number.]
        1. Defensive purposes:
          For a castle garrison with troops that will do little apart from just standing behind the walls while the towers shoot the enemy, BeastMasters are the best, with 45% of their advancements going into Defense Skill, followed by Rangers, Knights and other might heroes.
          However, if your castle troops are bound to use offense (i.e. shooters, fast flyers or tough walkers), you might care for a barbarian with 43% of levels gained raising his attack. But, on the other hand, if the castle you're defending has a mage guild with some supportive spells (namely Cure/Dispel), you might try to get a hero with some knowledge to be able to remove negative spells like Blind.
        2. Offensive purposes:
          If you need the hero to become another primary hero, you already know the algorithm for choosing primary heroes.
        3. Miscellaneous (aka secondary) purposes:
          The true secondary hero should be of some use. A scout/picker should have fast troops. Check the hero recruitment for details - you're looking for Logistics, Pathfinding and/or scouting, especially for those guys that will peddle troops to front lines.

        Since the heroes now have different secondary skills, there is no recommendation like 'I usually take warlocks'... however, there are several all-round good picks:
        • Caitlin, Clavius, Octavia, Nagash, Damacon, Jenova, Aine and Lord Haart are moneymakers (all get +350/day, only Haart has Basic Estates with 5% bonus, so he'll need a few levels...)
        • Thane with Advanced Scholar makes a good spell-transporter, but there are others with basic scholar - a few learning stones or trees of knowledge will be on almost every map.
        • Voy is the super-sea scout with navigation and 5% bonus.
        • Deemer has Advanced Scouting - a good pick for early exploration.
        • Rissan, Calid, Saurug and Sephinroth produce 1 magic resource per day, each different. Remember that 25 days with this hero will pay back with a few marketplaces, if you don't need the resource.

        Also remember that the two-hero strategy doesn't implicate use of ONLY two heroes, but AT LEAST two heroes. The larger the map, the more heroes may be used. Generally in L or XL maps, at least one secondary guy should have aggressive ambitions, should visit all the skill boosts and pick good secondary skills, because his time will always come.
        The so-called backup guy should be taking his share of experience. Especially later, when the main hero will need large amounts of experience to advance, it's usually better to leave the chests and minor battles to the backup guy, because he'll profit from them much more. (The only exception should be the case when the main guy desperately NEEDS to upgrade some advanced skills.)
        The main purposes of the backup guy are carrying reinforcements, killing minor enemies and wandering armies and sometimes possibly softening up enemy resistance, even at the cost of sacrificing himself.
        If the secondary hero survives, he might often 'inherit' all the main hero's armies - especially if you happen to capture an enemy castle with some 7th-level troops left inside unpurchased. In that case the main hero buys all the big guys, and those often carry more firepower than his entire regular army. At this moment, the backup guy becomes sort of another main hero, and at those moments you really appreciate that you've already let him get some levels.
        [Or, if he's still pretty weak, you might consider giving HIM the dragons, who tend to take minimal losses, grab some quick experience by defeating wandering armies etc. - and later switch the armies. This applies in cases where the main hero's army relies very much on his skills and with the weak secondary guy would suffer many losses. A good example would be a knight with archery, luck and leadership, while the secondary guy is a spellcaster...]
        There might be a bunch of 'tertiary' heroes that can take care of collecting resources, gold and armies from self-refilling sources (usually one hero can do that easily) and of shuttling reinforcements to the front lines. Those guys need no experience, but should (if that doesn't mean BIG delays) visit learning stones and possibly trees of knowledge (if they're for free) to try and get scouting, logistics or estates. Also, buying a hero to scout and claim mines near a captured castle is usually worth the 2500 gold. The fresh information from scouting (even if he dies there) may navigate your main guy towards new enemies. And with some marketplaces, even a few days' worth of those mines' production will repay the gold if needed.
    3. Task sharing between the heroes:
      The main hero should spend his time doing only things that number two cannot do himself. The former are usually battling armies, getting artifacts and treasure chests, while the latter are usually occupying mines, picking up resources and exploring Witch huts. Of course, things like visiting learning stones and other skill increasing facilities are common. One more thing should the second guy do - probe trees of knowledge, because they're more valuable later in the game and you don't want to waste their potential in the beginning, when the needed experience can be achieved from a few chests.
    4. Experience and advancement handling:
      1. In witch huts you can get random Basic skill. [Random in sense that level creator cannot affect which skill will it be, but every time the level is started, the skill is set and remains the same for the single playtime.] So, the backup hero has one important role: If your main guy still has some secondary skill slots free, his friend will check every witch hut and sacrifice his free slots to learn what the hut has to offer. Sometimes you might even buy a new hero, if the backup guy has ambitions to become another primary hero.
      2. Treasure chests and other sources of experience. As the experience needed to gain a level rises with levels already gained, it is pretty obvious that if you have a hero with some 40000 exp. and his backup guy has some 3000 from stones only, then whenever you run into a treasure chest, you'd better take it with the second guy... Coz getting another +1 skill for your main guy is usually not as good as getting some +6 for the same price... Okay, sometimes you take them with the main guy, because you desperately need some advanced skill to be upgraded...
        [Here I strongly recommend checking every now and then, how much the hero needs to advance to next level. Even with 100000+ experience points, the hero sometimes is just a 'chest-far' from next level.]
      3. The same thing comes with wandering armies... If the backup guy can take them out, let him do it and leave only hard targets to your main guy. There is one exception... If the main guy has a strong army, it is a good strategy to leave him one empty stack when touring the landscape... (The backup guy will carry the seventh army in case it would be needed for a castle siege or enemy hero attack). Thus, wandering armies can join you. So, if you happen to have a force strong enough to defeat any wandering army, approach those you'd like to join you (or buy, having advanced/expert diplomacy) with the main guy. If they will join you, you can either use them immediately or give them to the backup guy, if they don't fit for the main guy's alignment, combat style or speed. Those armies can be then used by the backup guy as forces to kill other wandering armies and towns for experience, and later maybe to be left as garrisons in captured castles and towns (ogres and similar armies are excellent for this purpose, because their high HPs will either prevent any weaker heroes from attacking your towns or, at least, cause some losses to enemy armies.)
      4. One more comment on treasure chests - always bear in mind that they're also a good source of GOLD! If you plan to rely on diplomacy or build very expensive structures, it's worth it to take gold with the secondary heroes. In early game, I take all the 1000 gold/500 exp. chests for gold and with the better ones, I go for experience. Later I shift up and take gold even from the 1500/1000 ones.
      5. Also, if your secondary hero explores a closed area that you don't want to visit with the main guy (maybe there's nothing important except a few chests, or he's in a hurry to somewhere else), you might grab gold or exp. with the secondary guy even at early stages of the game, where the main hero could profit from them, but it'd cost too much time.
      6. All other sources of experience and skills can be used by more heroes (stones, primary skills increasers and tree of knowledge, which has the same price for all heroes (decided at startup, can be nothing, 10 gems or 2000 gold)). In case that a Tree of Knowledge wants no payment, the level advancement is done immediately (which is why it's better to try those with the backup guys first, because using a tree of knowledge before you already have some 50000 exp. is a real waste - in case that the main hero will probably be near this tree later in the game - if you're sure you'll never get there again, grab the level!).
      7. Harpies at sea and altar of sacrifice are a good thing. No longer will you just carry around unneeded artifacts and troops that have joined you yet you don't want them. Imagine this: Your hero with some massive army is offered i.e. a throng of demons. You accept, give them immediately to the secondary guy who sacrifices them, getting some two, three levels for them... and this can happen pretty often.
    5. Artifacts
      Since the hero now has the backpack and can un-equip artifacts, handling them is much easier. There is one general rule: most of the artifacts COULD be needed later, even if you don't want to use them. Equipping the Spirit of Oppression (that prevents morale bonuses) would be a nonsense when you have positive morale. But you might run into an enemy that has expert leadership, three medals, visited temple and likes to cast Mirth on expert level. Needless to say, in that battle you'll gladly use the morale preventer.
      Since you can now wield only one weapon etc., your backup heroes will probably get some from the captured artifacts, too. Also, only four misc. artifacts can be equipped, so it's imperative that you transfer all gold and resource providing stuff to some secondary guy.
      Unwanted artifacts are disposed of at altars of sacrifice, already mentioned above.
    6. Movement rules
      Movement bonuses for fast armies (the slowest troop determines the bonus) are:
      0 for Extra Slow (4)
      1 for Slow(5)
      2 for Swift or Extra Swift (6-7)
      3 for Very Swift(8)
      4 for Ultra Swift or Super Swift (9-10)
      5 for Quick and faster (11+)
      Therefore, having Arch Devils (17) instead of Devils (11) won't help you, but you'll outrun a knight with Archers by 5 tiles/day. The default value is 15 tiles.
      [Just came to my mind that this could be a good way to waste your opponent's time - once the enemy wanders onto the edge of your territory, get close enough to him to let him see you. But be sure to have very fast armies, possibly logistics or a movement increasing artifact... and every turn, keep just a few steps ahead of him... And run around the map, wasting the time of the powerful enemy army...]
    7. Also, always use appropriate heroes! In rough terrain, try to hire somebody with Pathfinding; on high seas, try hard for Navigation. Remember which skills the nearby witch huts give, and let secondary heroes get some experience ASAP to determine the purpose they'll be used for.
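
The movement bonus table in rule 6 above can be written as a small lookup. This is a sketch based only on the numbers listed there; the function names are made up, and treating speeds below 4 (which the table does not list) as bonus 0 is an assumption.

```python
BASE_TILES = 15  # default movement per day, as stated in rule 6


def movement_bonus(slowest_speed):
    """Bonus tiles per day from the speed of the slowest troop in the army."""
    if slowest_speed >= 11:
        return 5
    if slowest_speed >= 9:
        return 4
    if slowest_speed == 8:
        return 3
    if slowest_speed >= 6:
        return 2
    if slowest_speed == 5:
        return 1
    return 0  # speed 4; anything slower is assumed to get no bonus either


def daily_tiles(troop_speeds):
    """Tiles per day for an army, given the speeds of all its troops."""
    return BASE_TILES + movement_bonus(min(troop_speeds))
```

For example, daily_tiles([17]) and daily_tiles([11]) both give 20, which is the point made about Arch Devils versus Devils, while an army slowed by a speed-4 stack stays at 15.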


      Every hero has eight slots for advanced skills. Therefore you have to be picky and choose only those you'll profit from. The simple fact that your hero would profit from a skill doesn't mean that you want to take it... There always could be more skills to profit from. A good example may be a knight that gets offered Advanced Leadership and Basic Archery. A newbie will snatch the archery, while the Pro will stop here, think about waiting for wisdom/logistics with the free slot and therefore will take the advanced one. This waiting is generally affordable in cases where the basic skill offered to you isn't a rare skill for your hero, or in cases where you don't badly need it right now. And the archery in this example is a good one, since knight has only two shooters of which one is frequently targeted by all enemies.

      • Ballistics: A must for walker based armies, in every scenario with lots of castles it will save you some losses to your vulnerable shooters... Either by destroying enemy castle installations or just making way for your armies to run in and slay the defending troops sooner. Not needed much for flyers/shooters based armies, such as dragon or titan task forces. Expert ballistics increases chance of hitting arrow towers to 75% (until all walls are destroyed).
        However, there is one big advantage of ballistics: Since you get manual control of the catapult and the catapult always plays first in the battle rounds, you can cast a spell before the opponent gets his fastest troop to move. Favorite spells are Slow/Haste/Earthquake.
      • Artillery: Might heroes profit from that skill pretty well. It has proven worthy in many battles, especially in the earlier stages of the game. Its firepower can be immense.
      • Archery: Increased damage from ranged attacks usually comes in pretty handy, unless you have small stacks or no shooters at all (e.g. Fortress, Necropolis).
      • Tactics: This is the BEST skill for me, no doubt about that. Two things should be separated:
        1) The fact that your walkers can start their action at 7th row - when your armies are faster than enemy, this is the deadliest skill in the game, because even relatively slow walkers (6-7) can strike in the first turn - so if everything goes well, the enemy doesn't get to play at all! Especially well used with Rampart or Castle troops.
        2) Then there is the ability to rearrange your troops upon seeing the enemy 'configuration' - and it's always nice to be able to move all your troops out of range of the enemy fast flyers! I've won many battles ONLY because I had this skill.
        Only in castle defenses this skill is less useful, as the point 1) doesn't apply - but point 2) still does!
      • Offense: A very potent skill for all Might heroes, especially those that depend on an aggressive approach.
      • Armorer: Smaller brother of Offense, a bit less useful, but definitely beats Mysticism and Eagle Eye ;-)
      • First Aid: This could be an important skill for cases where enemy fire is concentrated on high-level units. If you're walking around the map with your 7th-level creatures only, the First Aid Tent with 100 points of healing can prevent losses - it successfully eliminates castle towers' fire (towers usually score about 100 points of damage.)
      • Luck: Causing double damage can't be a bad thing, ever ;-), but it's never a thing I'd wait for to come. And, the damage is not exactly doubled, it is increased only by an average raw damage of the stack, without A/D included.
      • Leadership: Can sometimes save you a lot of trouble and losses. I prefer it to Luck, as it gives you more tactical benefits. (You can let enemy units share the double damage you're granted.)
      • Learning: A useless skill. For a wasted slot, you only get one level ahead of opposition.
      • Logistics: A must, except for Small, sometimes Medium and those island realms scenarios.
      • Pathfinding: A must wherever large desert, swamp or snowy areas require exploration. Otherwise useless.
      • Navigation: Don't have to tell you that you won't need that, when you have no seas/rivers to conquer, do I? ;-)
      • Necromancy: Not of much use, unless you manage to get expert one and find large masses of cannon fodder such as trogs. However, in necromancer's hands it's a powerful weapon.
      • Resistance: This skill is stronger than it looks. Especially heroes that have it as a special are dangerous opponents, because a spell resisted at the right time can decide whole battle. If you manage to get some artifacts, enemy spellcasters will have a hard day then. On the other hand, results are not guaranteed and therefore I usually pick other skills - those that work always, not by percentage.
      • Diplomacy: If you get it in early stages, comes quite handy, coz with expert level of this skill, you get quite a fair price for new units. [And you can see how many they are, before fighting them... and hell, there IS a difference between 'lots' and 'lots', 20 and 40 ogre lords definitely aren't a bit the same easy bait. A few times, having adv/expert diplomacy saved my life, because I could avoid fighting a stack of grand-elves, that would have beat crap out of my armies.] There's another VERY important aspect of diplomacy - surrendering is cheaper! At expert level, you have 60% discount, which can be pretty important in saving crucial parts of your army.
        And another thing that I tested: Every level of diplomacy gives you TWO levels' discount on the requirement to enter Library of Enlightenment! So that if your hero at level 4 has Expert Diplomacy (or at 6 Advanced or at 8 Basic), he'll be let in, getting +2 to all skills! This can be very useful if the library is near to your castle (and even more, if you can learn Diplomacy in a hut). You might then want your secondary heroes and/or castle defenders to get the Diplomacy, grab some experience and get this +8 bonus...
        From Gus at NWC we've learned exactly how Diplomacy works. Here's the mail quoted (and slightly reworded):

        Here's how it works, in a nutshell.
        Diplomacy makes it more likely that:

        • Monsters will join you for free.
        • Monsters will join you for money.
        • Monsters will run away (avoid fighting).

        Each monster has an aggression level, set at the start of the game. You can adjust the range of aggression in the editor, but there's always a random factor, unless the monster is "compliant" or "savage."
        When you encounter a monster group, the monsters first decide whether to fight. This depends on your strength, how friendly they are, your diplomacy skill, and whether you have similar creatures in your army.

        If they are friendly or ambivalent, they may decide to join you. For normal monsters, 10% of the time they'll join you for free. Every level of diplomacy makes this 10% more likely. Having similar creatures also makes this 10% more likely. Having an army that is mostly similar creatures makes it 20% more likely.

        Example: Sir Christian has an army composed exclusively of pikemen, and Expert diplomacy. Normal pikemen and halberdier stacks will join him roughly 60% of the time - if they don't fight.
        Diplomacy also allows you to persuade monster stacks to join for money.
        You always get the full stack, and the price doesn't change. Each level adds 10%.
        Of the pikemen stacks that Sir Christian encounters, another 30% will offer to join for money, again if they don't fight. Only 10% of them will flee without offering to join. Sir Christian's a really likable guy.
        If the creatures are ambivalent, and you turn down the offer to join, they'll fight you. If they're afraid (or impressed by your silver tongue), they'll run away. Diplomacy never makes it more likely they'll fight. If you turn down an offer and get a battle, they would have fought you anyway.
        This is for normal, "Aggressive" monsters. If they're set to "Compliant" in the editor, they'll join anyone. If they're "Friendly", they're 30% more likely to join. If they're "Hostile", they're 30% less likely to join - you'll need Expert diplomacy, or similar creatures, to have any chance of them joining for free. If they're "Savage", they won't join anyone.

        [End of quotation.]
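
The percentages in the quoted mail can be put into a small calculator. This is only one reading of the mail, not the game's actual code: the function join_chances and its parameter names are hypothetical, the modifiers are assumed to add up linearly, and the Friendly/Hostile shift is assumed to apply only to joining for free. The chances apply only when the monsters have already decided not to fight.

```python
DIPLOMACY_LEVEL = {"none": 0, "basic": 1, "advanced": 2, "expert": 3}
SIMILAR_BONUS = {"none": 0, "some": 10, "most": 20}   # similar creatures in army
DISPOSITION_SHIFT = {"friendly": 30, "aggressive": 0, "hostile": -30}


def join_chances(diplomacy="none", similar="none", disposition="aggressive"):
    """Percent chances that a non-fighting stack joins free, joins paid, or neither."""
    if disposition == "compliant":
        return {"free": 100, "paid": 0, "other": 0}
    if disposition == "savage":
        return {"free": 0, "paid": 0, "other": 100}
    level = DIPLOMACY_LEVEL[diplomacy]
    # Base 10%, +10% per Diplomacy level, plus the similar-creature bonus.
    free = 10 + 10 * level + SIMILAR_BONUS[similar] + DISPOSITION_SHIFT[disposition]
    free = max(0, min(100, free))
    # Each Diplomacy level adds 10% for joining for money.
    paid = min(10 * level, 100 - free)
    return {"free": free, "paid": paid, "other": 100 - free - paid}
```

With join_chances("expert", "most") this gives 60% free, 30% paid and 10% neither, matching the Sir Christian example in the mail.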

      • Scouting: Useful for exploration, coz you'll be able not only to see resources and mines you're looking for, but sometimes you'll spot your enemy soon enough to turn back or even spot him without him spotting you.
      • Eagle Eye: Learning spells from your enemies is a good concept, especially if you don't have enough resources to build your own guilds. But I personally prefer other skills and take this only when I either have to or know I'll need it. (Being offered Navigation on land-based scenario, it's really lesser evil to take Eagle Eye.) Of course, there might be situations where you could even RELY on this skill. Imagine a map that is short on magical resources and yet you've been forced a magic hero...
      • Estates: I never take estates for my main hero, but the secondary guys would be foolish to not take it. If there's a witch hut, all the army and resource couriers should visit first and then get some levels at stones and trees to upgrade.
      • Intelligence: More than effective replacement of Mysticism. Top spellcasters can then make long journeys and be generous with offensive spells ;-). Also, Might heroes that won't ever be good at knowledge should consider taking this skill - if only to have a decent spellpoint reserve for Haste/Slow/Dispel - which are pretty important. Especially dispel may be needed pretty often against the way too often blinding AI.
      • Sorcery: A must for offensively thinking spellcasters. The 15% increase of damage is a great thing.
      • Scholar: This is a great skill for secondary heroes! You won't have to return home with your main hero to learn spells from a newly built guild - the secondary guy will take the knowledge on the road and share it with all others.
      • Mysticism: Could be useful for long journeys without castles, if you need to use magic. The better knowledge your hero will have, the lesser need of that skill you have. However, with magic schools that reduce spell costs, I rarely use mysticism - sometimes with Might heroes that need to cast Haste/Slow/Dispel often and don't want to waste a slot on a magic school (because their most used spells come from different schools each.)
      • Wisdom: Unless you build higher level of mage guilds, you won't find much use to it, but take this rather than other, worse skills, coz somebody might build the guild levels for you (either in your or his city). In large scenarios, even magic-hating/fearing knights/barbies should take this one - it will give them access to spells in conquered castles and spells like Berzerker may greatly improve their battle efficiency, not to mention getting i.e. TownGate. However, I would recommend some planning here - to try and get some magic school, too, as wisdom alone will not get you the real power of the spells. [I.e. Town Portal without advanced Earth magic sucks.]
      • Magic schools: I don't want flamewars about which school is better, so I'll let anyone make their choice... All damage spells have increased base damage (usually it's + 10,20,30 or 10,20,50), elemental summoning has doubled effectivity. Protection from another school is 50% for all friendly units..
        I will use the following syntax: Interesting spell (effects), applying for Expert level of that school.
        • Air magic:
          • Haste (All +5 speed) - a decisive factor
          • View Air (Artifacts, heroes, Towns) - great for hunting down enemies
          • Disrupting Ray (Defense -5)
          • Precision (All ranged +6 Attack) - that means +30% damage!
          • Air shield (ranged damage taken is reduced to 50%)
          • Counterstrike (All +2 retaliations!) - great for armies with specials!
          • Dimension Door (4 jumps/day)
          • Fly (100% of normal movement)
          Important spells: ChainLightning, Hypnotize (to use enemy's small stacks to waste retaliations of his top units.)
        • Water magic:
          • Bless (All Damage is Max(damage)+1)
          • Cure (Cures all friendly)
          • Dispel (Dispels ALL spells on the battlefield)
          • Weakness (All enemy -6 A)
          • Forgetfulness (All enemy ranged attackers don't use range attack)
          • Teleport (Can teleport anywhere, incl. castle.)
          • Clone (Works on 7th-level troops too)
          • Prayer (All A,D and speed +4)
          • Water Walk (100% of normal movement)
          Important spells: Frost Ring, Summon Boat, Scuttle Boat
        • Earth magic:
          • Slow (All 50% speed)
          • Stone Skin (All +6 Defense)
          • View Earth (Entire terrain and mines/resources)
          • AntiMagic (Immunity to all spells)
          • Town Portal (Can choose destination)
          Important spells: Animate Dead, Resurrect, DeathRipple, Meteor Shower, Implosion (aka the only Dragon killer around ;-))
        • Fire magic:
          • Bloodlust (All +6 Attack)
          • Blind (Doesn't retaliate when blinded and attacked)
          • Berserk (All in 19hex radius affected!) - total insanity!
          • Frenzy (Target has A increased for 2xD)
          • Slayer (+8 A against all 7th-level troops)
          Important spells: Armageddon, Fireball, FireWall, Inferno, Sacrifice!

      Favorite secondary skills are a tough part.
      For a might hero, I would suggest Tactics, Offense, Leadership, Ballistics. Other slots might be used for Logistics, Wisdom, one magic school and maybe Intelligence, but skills like Diplomacy, Artillery, Archery, First Aid, Pathfinding or Learning are applicable as well.
      Magic heroes should have Wisdom, at least one but preferably more magic schools, possibly Sorcery and Intelligence. They might use Tactics as well; Logistics needn't be mentioned.
      Of course, you might have a Super-duper spellcaster with Sorcery, Wisdom, Intelligence and four spell schools - and still one slot left. Or you could have a mega-might freak with Offense, Armorer, Archery, Leadership, Artillery, Ballistics and Tactics - and still one slot left.

      In general, when deciding about a particular skill, you should always bear in mind particular conditions in that particular game. You won't need Estates, if you have a few goldmines/endless bags. You won't need Diplomacy, if you're very short of gold or if there are only a few armies that you would want to join you. You won't need Pathfinding on grassland landmasses. You won't need Logistics in a Small map. You won't need luck with strong armies. You won't need Intelligence if there are enough castles and wells in the map...

      If there are any skills that are useful 'always', then I would vote for Tactics - it will bring results in the first week - no building required, no special map conditions (size/terrain/armies) needed etc.

    8. Spell Casting - adventure spells
      When an adventure spell is cast, ALL your heroes profit from it. So, if you have some spare guy at home, cast all view spells with him, sitting in his castle.
      I don't have to mention the profits of Dimension Door, do I? Remember one thing, though. Plan your DD journeys well, considering wells ;-). It's good to be able to travel half the map in one day, but it is even better to end your trip with almost full spellpoints, in case you might need them - e.g. to travel back or use powerful offensive magic to repel an enemy that you've gotten near to. It's always nice to see a computer hero DDing onto a little island and then waiting a week to regain ten points to get back - especially if he has his pack of titans with him and not in his home castle... Also bear in mind that DD is limited to 2/3/4 jumps/day - so make sure that if you can cast it twice, you won't end near somebody you don't want to face in battle!
      The Town Portal is an even bigger bastard than DD. Suppose you capture an enemy castle with some heavy losses - you warp home, get new armies, run to a well, which is often near your starting castle, and then warp back to the newly captured castle, all in one turn!
      [I personally began to dislike DD and Town Portal, finding them very unbalancing to the game.]
      Dirty Trick: Hit'n'Run:
      Although the following is nasty and almost cheating, you can do this: Having some very fast units in your castle, an enemy nearby with a huge army and some offensive spells in your guild: Split your very fast units into 7 stacks, if possible. Just have the seven stacks occupied by 1 unit each. Attack the enemy, demolish his stacks with spells, retreat, hire again and repeat. If you have a well nearby, you can do this as long as you have the cash. Having 7 stacks of 1 unit often allows you to cast 2 or even 3 spells per battle. (Remember, you must FLEE, don't get killed!). This may be applied even with many starting heroes, but the process is generally more costly. And anyway, it's considered 'dirty' by many players.
      Softening up an enemy before attack/defense may have another aspect - if you use a strong enough army, you might force the AI to cast spells and thus waste spellpoints. If you sacrifice one or two armies this way, in the following castle siege the AI may not have enough spellpoints to cast the annoying Blind on your shooters.
      This way, you can reduce the enemy's army to quite poor numbers and then either fortify at your castle, or even slay him with your main army, which was resting in your castle all the time (and maybe providing the very fast units for those repeated attacks). Note that sometimes it would be better to use non-offensive spells, eh, I mean Berserk ;-))
      In early exploration, casting View Earth or View Air may have crucial impact on your success. Seeing mines is often helpful, because there are often scattered respective resources near a mine, so it's important to be there sooner than your opponents.

  3. Map structures and other information

  4. This section is a shortened version of Christoper Nahr's HOMM3 Manual Addendum, rewritten and published with author's permission. The original is recommended for most players as required reading...
    • Garrison heroes cannot be traded with - you have to click on the castle, move the hero to 'visiting' slot and then manually initiate a trade with another hero. Garrison heroes cannot be anyhow selected from the map. Putting a hero to the garrison has therefore effect similar to 'sleeping' him (the tent icon), but not only you don't get him selected by 'next hero' icon (or 'n' key), the hero isn't even visible in heroes column.
    • If there are both visiting and garrison hero present and enemy attacks, the visiting hero is fought on free terrain. After the battle another 'click' on the castle is needed to fight the hero inside.
    • [Therefore it may sometimes be handy to purchase a visiting hero - just to hold the enemy one more step - if his planning was tight, it might happen that he doesn't have the one extra movement point left and you get an extra turn - and even on days other than day 7 this might be good - you might upgrade a dwelling, get some beneficial building or increase fortification level...]
    • If there is a visiting hero (that is visible in the castle entrance) and no hero inside, the garrison armies are merged into the hero's army, what cannot be merged is unused (just as in HOMM2) - and if the defender wins the battle, the armies are still there, if he loses, they're lost without fight. Therefore, be sure to check whether some castle army shouldn't be better transferred to the hero right before the siege - I don't know exact algorithm for choosing armies from the garrison slots to be merged into the hero's army, so if you want to be sure, better do this manually.
    • Creature Dwellings. Dwellings offer a fixed number of creatures, replenished once per week, to any visiting hero whose flag will then be planted at the dwelling. The actual number of creatures offered is equivalent to the base production of that creature type in their native town type. Any creature dwelling that flies your flag will also add one creature to the weekly production of all of your towns with a corresponding creature generator. This means that if you have flagged three Cursed Temples on the Adventure Map and two Necropolis towns with a Cursed Temple already built, each of these towns will offer three additional Skeletons per week. To fully exploit a creature dwelling you will have to keep sending heroes there once per week. The bonus creatures added to a town's production will accumulate along with regularly produced creatures until you choose to buy them, but any creatures that were not fetched from a dwelling in any given week will be lost. The Dungeon Town's Portal of Summoning duplicates a random, flagged dwelling's weekly production by offering the same number of creatures again that is also, and additionally, available at the dwelling itself. However, in this case, too, unhired creatures will be lost.
      First-level creatures will join a hero for free when invited at their dwelling but come at the usual price when added to a town's production. Creatures purchased directly at a dwelling are always of basic quality whereas creatures added to a town's production are hired at the same quality (basic or upgraded) as the corresponding town structure. Their price changes accordingly.
    • Cursed Ground. No spellcasting is possible and all positive luck and morale modifiers are removed.
    • Magic Plains. All adventure and combat spells are cast at expert proficiency.
    • Idol of Fortune. While giving either a morale or a luck bonus on most days, the Idol will give both bonuses on the 7th day of the week.
    • Temple. The morale bonus bestowed on your troops by the Temple is doubled on the 7th day of the week. Note that it is always a good idea to conquer cities on the day just before the weekly creature growth because you will be able to replenish your forces on the very next day. The enhanced powers of Idols and Temples on the 7th day of the week are meant to encourage this strategy.
    • Spirit of Oppression. This artifact does not negate the morale bonus for wandering creatures on their native terrain, although it does negate this same bonus for creatures in a hero's army.
    • Hero Specialty. Some hero specialties come as bonuses to creatures, spells, or secondary skills. The descriptions of these specialties state that the bonus would increase "for every level after the nth" which is somewhat misleading. The bonus is not actually increased by a fixed amount of points per level. Its calculation merely involves the current hero level as a factor, as follows:
      • Creature Speciality. The indicated creatures gain a percentage bonus to their Attack and Defense ratings equivalent to (Hero Level / Creature Level) x 5. Note that since Creature Level appears as denominator, it can take a very long time before you see any effect at all on high level creatures! Therefore it's usually better to pick heroes that are specialized in lower level units. A hero on 20th level would give 50% bonus to skills of archers and 15% bonus to skills of Champions, if he had these two as special units. Practical effects will be pretty close, since the archers will get +3 to their attack of six, while the champions will get +2-3 to their attack of 16. (In other words, the lower percentage due to higher denominator in the formula is balanced by higher base numbers to which the counted percentage is applied.)
      • Spell Speciality. The effect of the indicated spell is enhanced by a percentage value of (Hero Level / Creature Level) x 3, where "Creature Level" is the level of the creature(s) targeted by the spell.
      • Skill Speciality. The effect of the indicated secondary skill is enhanced by a percentage value of Hero Level x 5. Note that this percentage value is multiplied with, rather than added to, the skill value: a 10th level hero with a Logistics specialty and Expert Logistics (30% movement bonus) receives an additional movement bonus of 15% (30% + 30% x (5% x 10) = 30% + 15% = 45%).
    • Army formation selector is used to determine the amount of free space between army stacks. Its functionality is very limited - the resulting formations differ between the two settings only for armies with 2-5 slots occupied - armies of 1, 6 or 7 stacks will always take the same position, no matter which setting you use.
    • Tactics selector can be used to switch off the usage of Tactics before the actual battle - on player side. This means that if your hero has Tactics skill and you don't want to use it most of the time, you switch it off here. Note that your hero will still 'use' his level of Tactics to eliminate enemy hero's Tactics, since Tactics of one hero eliminate the same amount of levels of Tactics of the other hero (i.e. you having Expert and enemy Advanced, then in the battle it looks like you have Basic and enemy has nothing...)
    • Archery skill applies to fortification fire, too! Therefore it's good to buy heroes with Archery for castle defenses (The castle usually does some 100 points of damage - and the increase comes pretty handy.)
    • Morale bonuses are individual for every unit in the battle - all units use the modifier that their hero gives them, but they get morale boost from their native terrain. Also, some creatures have specific morale rules. (Undead - no morale. Minotaurs - morale always at least at +1).
    • Special Abilities: (Briefly and incorrectly mentioned in the manual as "Make a Special Attack.") Some creatures have a special ability, such as the Archangel's ability to resurrect dead allied troops once per combat. When a stack of creatures with a special ability is active, just place the mouse cursor over whichever troop you want to use their ability on. Watch the mouse cursor change to an animated spell-casting icon and click to apply the special ability to the target troop. Using a special ability ends the turn for the performing creatures (but not for the target creatures).
      All special abilities except for the Mighty Gorgon's death stare and the Dendroid's entangling attack are affected by spell resistances, immunities, and Cursed Ground in the same way as hero spells are. Alas, there is no indication whatsoever when a special attack fails due to such circumstances; the attack simply will not happen.
    • Two-hex creatures: Any ranged attack on a two-hex creature is considered hampered if any of the hexes occupied by the creature shows a "broken arrow" cursor. This is true even if the cursor turns into a whole arrow on the other hexagon. Also note that both hexes of the creature can take damage from some spells (notably FireWall - if the creature moves right through the firewall, it'll take double damage - one hit for every hex). Area damage spells such as Fireball, however, don't have this effect.
    • Purchasing creatures: When you click on the castle/Citadel/Fort, you can not only see all stats, but by rightclicking display stats of the units and by left clicking purchase them.
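The Hero Specialty formulas in the list above can be checked with a few lines of shell arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes the quotient Hero Level / Creature Level is rounded down. That assumption is not stated in the text, but it reproduces the quoted 50% and 15% figures if archers count as level 2 and Champions as level 6 (also assumptions).

```shell
#!/bin/sh
# Hypothetical helpers; the formulas are the ones quoted in the list above.
# Integer division is an assumption (the text does not say how to round).

# creature_bonus_pct HERO_LEVEL CREATURE_LEVEL -> Attack/Defense bonus in percent
creature_bonus_pct() {
    echo $(( $1 / $2 * 5 ))
}

# spell_bonus_pct HERO_LEVEL TARGET_CREATURE_LEVEL -> spell effect bonus in percent
spell_bonus_pct() {
    echo $(( $1 / $2 * 3 ))
}

# skill_total_pct BASE_SKILL_PCT HERO_LEVEL -> total skill effect in percent
# The specialty multiplies the skill value by (1 + hero level x 5%).
skill_total_pct() {
    echo $(( $1 * (100 + $2 * 5) / 100 ))
}

creature_bonus_pct 20 2    # level-20 hero, archers assumed level 2
creature_bonus_pct 20 6    # level-20 hero, Champions assumed level 6
skill_total_pct 30 10      # Expert Logistics (30%) on a level-10 hero
```

The last call mirrors the Logistics example in the text: 30% becomes 45% in total, not 30% + 50%.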
  5. Building
  6. General rules on building are these:
    You should always start with money, i.e. a Town Hall. However, since a Town Hall built on day 1 earns you only 1000 gold extra by Day 1 of next week, you might consider building creature dwellings and benefit structures instead.
    When upgrading dwellings, consider these points:

    • The most important upgrades are those that increase the speed of your fastest unit. This applies especially to those cases where your fastest unit is rather slow. Getting an upgrade to speeds like 11 means that you'll at least sometimes get the first shot in battles.
    • The second most important upgrades are those that increase speed of your slowest unit(s). They give you extra movement points for each turn.
    • The alternative second most important upgrades are those that give your units some specials. Why alternative second-most? It's good for a barbarian to move one step faster with his orc chiefs, yet it's comparatively good (yet in different ways) for gremlins to start throwing their balls or for vampires to start drinking blood of their enemies in large scale ;-)).
    • Upgrades to 7th-level creatures are sometimes a great change (e.g. Behemoths)

    A Town Hall built on day one will result in extra 1000 gold on first day of next week (and then extra 3500 every first day of week). The build schedules shown in other articles build it, but there may be cases where you could build some highlevel dwelling at the cost of not building the Town Hall. So, before you build your first structure, try to guess whether you will be able (gold and resource-wise) to go for some high-end dwelling that quickly, and alter the building schedule accordingly.
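The 1000 and 3500 figures above follow from simple arithmetic, sketched below. The daily incomes (500 for a Village Hall, 1000 for a Town Hall) come from the structure notes in this guide; the build cost of 2500 gold is an assumption taken from the standard game prices and is not stated in the text. The function name is hypothetical.

```shell
#!/bin/sh
# town_hall_net_gain WEEKS -> extra gold after WEEKS full weeks,
# compared with keeping the Village Hall, for a Town Hall built on day 1.
town_hall_net_gain() {
    weeks=$1
    extra_per_day=$(( 1000 - 500 ))   # Town Hall income minus Village Hall income
    cost=2500                         # assumed build cost, not given in the text
    echo $(( weeks * 7 * extra_per_day - cost ))
}

town_hall_net_gain 1   # after the first week
town_hall_net_gain 2   # after the second week
```

Under these assumptions the first week only just pays back the building, and each later week adds 7 x 500 = 3500 gold.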
    Always try to build most of dwellings before day one of next week. All miscellaneous buildings like Mage Guild, MarketPlace or Blacksmith can wait.
    The second week should be usually aimed at building money generators including Capitol, if possible.
    Ideal case: You start with two towns. One goes for the creatures and Citadel/Castle, the other goes for money in the first week. In the second week, the first town goes for capitol, while the second stays idle. Further on, all the cash that is left after troops are purchased should go to upgrades and also lower level dwellings in the second town.
    If you really don't know what to do with cash, build other structures. Short notes on all common structures:
    • Village hall: Default building, earns 500 g/t.
    • Town hall: First upgrade, earns 1000 g/t.
    • City hall: Second upgrade, earns 2000 g/t.
    • Capitol: Last upgrade, earns 4000 g/t, can be built only in one city per player.
    • Fort: Basic military structure - adds a wall to let shooters in your garrison defend the city. Also allows building of basic creature dwellings.
    • Citadel: Adds moat and arrow tower, increases base creature production by 50%, rounded DOWN!
    • Castle: Adds two arrow towers, stronger walls, increases base creature production by 100%.
    • Mage guilds need no big explanation, only remember that some town types are restricted in levels of mage guilds allowed.
    • Marketplaces generally come to work well in large numbers. [The more marketplaces you control, the better conversion rates you get. Three marketplaces get you the same rates as trading posts. With eight or more the rate drops to 1:2 in resources.]
    • Tavern is the place where you can recruit new heroes. Usually it's already there.
    • Class-dependent buildings can be of use. Some help with defense, while others give some benefits to heroes. Check Way of particular castle for details.
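The Citadel and Castle growth bonuses from the list above can be worked out as follows. This is a sketch with hypothetical function names; the base growth values passed in are made-up inputs, not figures from the guide.

```shell
#!/bin/sh
# citadel_growth BASE -> weekly growth with a Citadel (+50%, rounded down)
citadel_growth() {
    echo $(( $1 * 3 / 2 ))
}

# castle_growth BASE -> weekly growth with a Castle (+100%)
castle_growth() {
    echo $(( $1 * 2 ))
}

citadel_growth 7   # example base growth of 7
castle_growth 7
citadel_growth 1   # a base growth of 1 gains nothing from a Citadel
```

The rounding matters mostly for high-level dwellings with a base growth of 1, where only the Castle adds a creature.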
  7. Defense strategy
  8. Get used to computer heroes attacking your castles every now and then. There are three main ways to choose from:

    1. Take and forget - where you capture a castle and enjoy the gold it provides, learn its spells, use its benefits and leave. The enemy soon tries to recapture it, but you've not helped him any by building up the castle.
    2. Take and hold - where you sit in the captured castle and wait for the enemy to come. This isn't any good, since it binds your main force to that castle and also the enemy may attack a bit later, when he feels strong enough to beat you.
    3. Take and expand - where you stay near the castle with some decent armies (usually the main army), whose hero runs around to flag mines and get benefits (skill boosts etc.), while you build and upgrade dwellings, raise a castle to get maximum defensive power, buy another hero, run the benefits with him, too, recruit all troops and get ready to hold the castle with its own troops against enemy raids. With two weeks' production you can move your main hero away. This approach is pretty costly, but remember one thing: if you take a castle and let the enemy retake it, all the troops that were left inside unrecruited will probably be recruited by the enemy and you'll face them later in battle. And since castle defenses play first in the siege, this will usually mean more losses on your side. Of course, if you have only 7th-level troops in the recapturing army, the towers won't score a kill and you'll gain experience for the troops you've killed.
      Always think first - is the castle worth holding at all costs? Are the troops inside strong enough to defend it? Or are they weak enough so you won't waste money on them and kill them later, when the enemy has paid for them? How strong enemy forces will try to take the castle?
      The AI's not stupid. It has learned much and will try its best to strike on Day 7! So better be prepared.
      Defending the castles with newbie heroes has one good and one bad side. The bad side is higher losses, the good side is higher profits in gained levels, if you win.
      When defending ANY castle, think well about picking targets for defensive shooters. While the enemy may have relatively strong shooter armies, his walkers/flyers are usually a bigger threat - the shooters are out of effective range - they'll score half damage on you and you'll score half damage on them! So it's better to go after the walkers, especially those standing in the moat. Also, try to notice what units are targeted by your castle defenses and how much damage they suffer. For example, if your tower scores 60 damage and targets a stack with one archer left, it would be very wise to finish it, so the next 60 points from the tower will not be wasted on this one archer.
      Even if you're gonna lose the battle, fight to the end and try to inflict maximum losses, unless you want to surrender to save some precious troops.

November 25: Waiting for change, waiting for opportunity

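This is a very small problem. The company is cutting the power again, so I need to shut down all my server virtual machines, but a simple loop just would not work properly: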
cat serverList.txt | while read line; do /usr/bin/ssh -X -i /Users/nihuang/.ssh/id_rsa -o ConnectTimeout=3 -o BatchMode=yes root@${line} "nohup id >&- 2>&- <&- &" | echo $?  ; done
Later I changed the loop style slightly and then it worked. I don't know why.
nihuang@C02Y714BJGH5:~$ for host in $(cat serverList.txt); do /usr/bin/ssh -X -i /Users/nihuang/.ssh/id_rsa -o ConnectTimeout=3 -o BatchMode=yes root@${host} "nohup ip addr | grep ${host} &"  ; done
    inet 172.25.11.112/24 brd 172.25.11.255 scope global bond0
    inet 172.25.11.114/24 brd 172.25.11.255 scope global noprefixroute enp59s0f0
    inet 172.25.239.101/24 brd 172.25.239.255 scope global ens160
    inet 172.25.239.128/24 brd 172.25.239.255 scope global eth0
    inet 172.25.57.119/24 brd 172.25.57.255 scope global eth0
    inet 172.25.57.75/24 brd 172.25.57.255 scope global eth0
    inet 172.25.59.243/24 brd 172.25.59.255 scope global noprefixroute ens160
    inet 172.25.59.244/24 brd 172.25.59.255 scope global ens160
    inet 172.25.58.110/24 brd 172.25.58.255 scope global eth0
nihuang@C02Y714BJGH5:~$ 
nihuang@C02Y714BJGH5:~$ for host in $(cat serverList.txt); do /usr/bin/ssh -X -i /Users/nihuang/.ssh/id_rsa -o ConnectTimeout=3 -o BatchMode=yes root@${host} "nohup poweroff &"  ; done
ssh: connect to host 172.25.11.112 port 22: Operation timed out
ssh: connect to host 172.25.11.114 port 22: Operation timed out
ssh: connect to host 172.25.239.101 port 22: Operation timed out
ssh: connect to host 172.25.57.75 port 22: Operation timed out
ssh: connect to host 172.25.59.243 port 22: Operation timed out
ssh: connect to host 172.25.59.244 port 22: Operation timed out
nihuang@C02Y714BJGH5:~$ 
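A likely explanation for the difference between the two loops, offered as a sketch and not as a verified diagnosis of that session: in the `cat ... | while read` form, ssh inherits the loop's stdin, which is the host list itself, so the first ssh can swallow the remaining lines and the loop ends after one host. The `<&-` in the original command sits inside the quoted remote command, so it only closes stdin on the remote side. Also, `| echo $?` prints the status of whatever ran before the pipeline, not the status of ssh. The `for host in $(cat ...)` form reads the whole list before the loop starts, so nothing is left for ssh to consume. The demo below uses a stand-in function instead of ssh; `fake_ssh`, `hosts` and the host names are made up.

```shell
#!/bin/sh
# Stand-in for ssh: like ssh without -n, it reads everything on its stdin.
fake_ssh() { cat > /dev/null; }

# Hypothetical host list, standing in for "cat serverList.txt".
hosts() { printf '%s\n' hostA hostB hostC; }

# Pipe + while read: the command in the loop body shares the loop's stdin,
# so it eats hostB and hostC and the loop stops after the first line.
broken_loop() {
    hosts | while read -r line; do
        echo "visiting $line"
        fake_ssh
    done
}

# Same loop, but the inner command gets its stdin from /dev/null.
# With real ssh, "ssh -n" has the same effect.
fixed_loop() {
    hosts | while read -r line; do
        echo "visiting $line"
        fake_ssh < /dev/null
    done
}

broken_loop
fixed_loop
```

With real ssh, either `ssh -n ...` or `ssh ... < /dev/null` inside the `while read` loop should give the same behaviour as the `for` loop.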
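Another thing: I confirmed once again that lazy matching in a regex is independent of which matching flag you choose. I still do not fully understand this, because I seem to have read somewhere that a POSIX or Perl (or some such) match is the longest one, and I had taken that to be the same thing as lazy. By default, search will try to get the longest match, i.e. the result is "abc"x"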

                string str = "\"abc\"x\"";
                regex ex("\".+\"");   // greedy: .+ takes as much as it can
                cmatch what;
                if (regex_search(str.c_str(), what, ex))
                {
                    cout << what[0] << endl;   // prints "abc"x"
                }
If lazy, the result is "abc"

                string str = "\"abc\"x\"";
                regex ex("\".+?\"");   // lazy: .+? stops at the first closing quote
                cmatch what;
                if (regex_search(str.c_str(), what, ex))
                {
                    cout << what[0] << endl;   // prints "abc"
                }
I came across this post through a link in an article and promptly decided to save it.
Heroes' Stats and Skills Chances

Here are the chances for different hero classes to gain a point in a certain primary skill on level up:

Hero Class__ Level_ Attack____ Defense___ Spell Power Knowledge
--------------------------------------------------------
Alchemist___ 2-9____30%________30%________20%________20%
Alchemist___ 10+____30%________30%________20%________20%
Barbarian___ 2-9____55%________35%________5%_________5%
Barbarian___ 10+____30%________30%________20%________20%
Battle Mage_ 2-9____30%________20%________25%________25%
Battle Mage_ 10+____25%________25%________25%________25%
Beastmaster_ 2-9____30%________60%________5%_________5%
Beastmaster_ 10+____30%________30%________20%________20%
Cleric______ 2-9____20%________15%________30%________35%
Cleric______ 10+____20%________20%________30%________30%
Death Knight 2-9____30%________25%________20%________25%
Death Knight 10+____25%________25%________25%________25%
Demoniac____ 2-9____30%________30%________20%________20%
Demoniac____ 10+____20%________20%________30%________30%
Druid_______ 2-9____10%________20%________35%________35%
Druid_______ 10+____20%________20%________30%________30%
Elementalist 2-9____10%________10%________50%________30%
Elementalist 10+____20%________20%________30%________30%
Heretic_____ 2-9____20%________10%________35%________35%
Heretic_____ 10+____25%________25%________25%________25%
Knight______ 2-9____40%________40%________10%________10%
Knight______ 10+____30%________30%________20%________20%
Necromancer_ 2-9____15%________15%________35%________35%
Necromancer_ 10+____25%________25%________25%________25%
Overlord____ 2-9____40%________35%________15%________10%
Overlord____ 10+____30%________30%________20%________20%
Planeswalker 2-9____50%________30%________10%________10%
Planeswalker 10+____30%________30%________20%________20%
Ranger______ 2-9____35%________45%________10%________10%
Ranger______ 10+____30%________30%________20%________20%
Warlock_____ 2-9____10%________10%________50%________30%
Warlock_____ 10+____20%________20%________30%________30%
Witch_______ 2-9____5%_________15%________40%________40%
Witch_______ 10+____20%________20%________30%________30%
Wizard______ 2-9____10%________10%________40%________40%
Wizard______ 10+____30%________30%________20%________20%
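One way to read the table above is to turn the chances into an expected number of points. The sketch below makes two assumptions that the post does not spell out: every level-up grants exactly one primary skill point (each row adds up to 100%), and the "2-9" row applies when reaching levels 2 to 9 while the "10+" row applies from level 10 on. The function name is hypothetical.

```shell
#!/bin/sh
# expected_points CHANCE_2_9 CHANCE_10_PLUS TARGET_LEVEL
# Expected number of points gained in one primary skill between level 1
# and TARGET_LEVEL, given the two percentage chances from the table.
expected_points() {
    awk -v early="$1" -v late="$2" -v lvl="$3" 'BEGIN {
        total = 0
        for (l = 2; l <= lvl; l++)
            total += (l <= 9 ? early : late) / 100
        printf "%.2f\n", total
    }'
}

expected_points 40 30 10   # Knight, Attack, up to level 10
expected_points 55 30 15   # Barbarian, Attack, up to level 15
```

For the Knight this is 8 level-ups at 40% plus one at 30%, i.e. about 3.5 Attack points on average by level 10.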

Here are the chances for different hero classes to have a certain secondary skill offered on level up:
Skill_______ Cleric__ Warlock_ Witch___ Heretic_ Necro___ Druid___ B. Mage_ Wizard__ Elementalist
Air Magic___ 4________2________2________3________3________2________3________6________6
Archery_____ 3________2________3________4________2________5________4________2________2
Armorer_____ 3________1________4________4________2________3________4________1________1
Artillery___ 2________1________1________4________3________1________4________1________1
Ballistics__ 4________6________8________6________5________4________6________4________4
Diplomacy___ 7________4________2________3________4________4________3________4________4
Eagle Eye___ 6________8________10_______4________7________7________5________8________8
Earth Magic_ 3________5________3________4________8________4________3________3________6
Estates_____ 3________5________1________2________3________3________1________5________3
Fire Magic__ 2________3________3________5________2________1________3________2________6
First Aid___ 10_______6________8________5________0________7________4________7________4
Intelligence 6________8________7________6________6________7________5________10_______8
Leadership__ 2________3________1________2________0________2________4________4________3
Learning____ 4________4________4________4________4________4________4________4________4
Logistics___ 4________2________3________3________4________5________9________2________2
Luck________ 5________2________4________2________1________9________2________4________2
Mysticism___ 4________8________8________10_______6________6________4________8________8
Navigation__ 5________4________6________2________5________2________0________1________4
Necromancy__ 0________0________0________0________10_______0________0________0________0
Offense_____ 4________1________2________4________3________1________8________1________1
Pathfinding_ 2________2________2________4________6________5________4________2________2
Resistance__ 2________0________0________3________1________1________4________0________0
Scholar_____ 6________8________7________5________6________8________4________9________8
Scouting____ 3________2________2________3________2________2________4________2________2
Sorcery_____ 5________10_______8________6________6________6________6________8________8
Tactics_____ 2________1________1________4________2________1________5________1________1
Water Magic_ 4________2________3________2________3________4________3________3________6
Wisdom______ 7________10_______8________8________8________8________6________10_______8

Skill_______ Knight__ Overlord Beast___ Demon___ Death K. Ranger__ Barbarian Alchem. Planeswalker
Air Magic___ 3________1________1________2________2________1________3________4________2
Archery_____ 5________6________7________6________5________8________7________5________8
Armorer_____ 5________6________10_______7________5________8________6________8________5
Artillery___ 5________8________8________5________5________6________8________4________8
Ballistics__ 8________7________7________7________7________4________8________6________8
Diplomacy___ 4________3________1________4________2________4________1________3________2
Eagle Eye___ 2________2________1________3________4________2________2________3________2
Earth Magic_ 2________3________3________3________4________3________3________3________3
Estates_____ 6________4________1________3________0________2________2________4________3
Fire Magic__ 1________2________0________4________1________0________2________1________3
First Aid___ 2________1________6________2________0________3________1________2________1
Intelligence 1________1________1________2________5________2________1________4________1
Leadership__ 10_______8________5________3________0________6________5________3________3
Learning____ 4________4________4________4________4________4________4________10_______8
Logistics___ 5________8________8________10_______5________5________7________6________8
Luck________ 3________1________2________2________1________6________3________2________2
Mysticism___ 2________3________2________2________4________3________3________4________3
Navigation__ 8________4________8________4________8________3________2________3________5
Necromancy__ 0________0________0________0________10_______0________0________0________0
Offense_____ 7________8________5________8________7________5________10_______6________9
Pathfinding_ 4________5________8________4________4________7________8________4________6
Resistance__ 5________6________5________6________5________9________6________5________2
Scholar_____ 1________1________1________2________2________1________1________3________1
Scouting____ 4________5________7________5________4________7________8________4________6
Sorcery_____ 1________2________1________3________4________2________1________3________1
Tactics_____ 7________10_______6________6________5________5________8________4________8
Water Magic_ 4________0________2________1________3________3________0________2________2
Wisdom______ 3________3________2________4________6________3________2________6________2

Now... how does this thing work? Well, every time you level up, the computer offers you a choice between a skill you already have and a new skill (or between 2 new skills). The chance to get an existing skill is approximately (existing skill value)/(total of values of non-expert existing skills). The chance to get the new skill can be approximated by this formula: (value from the skill table)/(112 - total of values of the skills you already have).
For example, Ivor starts with Archery and Offense. The values for those skills add up to 13. The value of the Logistics skill for Ivor is 5.
So, the chance of Ivor getting Logistics when he reaches level 2 is:
5/(112-13) = 5/99 = 5.05%
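The approximation above can be checked with a few lines of Python. This is only a sketch of the formula as quoted: `new_skill_chance` is a hypothetical helper name, and the split of 13 into 8 (Archery) and 5 (Offense) is taken from the Ranger column of the table, assuming Ivor is a Ranger; the post itself only gives the sum.

```python
def new_skill_chance(table_value, owned_values, total=112):
    """Approximate chance of being offered a new skill.

    table_value  -- the class's value for the new skill in the table
    owned_values -- table values of the skills the hero already has
    total        -- column total of the table (112 in the quoted post)
    """
    remaining = total - sum(owned_values)
    if remaining <= 0:
        raise ValueError("owned skill values must sum to less than the total")
    return table_value / remaining


# Ivor: Archery (8) + Offense (5) = 13; Logistics has table value 5.
ivor_logistics = new_skill_chance(5, [8, 5])
print(f"{ivor_logistics:.2%}")  # 5/99, printed as 5.05%
```

The same helper can be pointed at any other row and column of the tables; it does not model the existing-skill half of the offer.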

*Note: Magic school skills and wisdom have some exceptions for might heroes. There may also be exceptions for the other skills.
If a might hero hasn't been offered a magic school yet, he will be offered one at level 4. If he turns it down, a magic school will be offered again at level 8; if he keeps turning it down, he will also have one offered at levels 12, 16 and 20 (i.e. every 4th level). If he gets offered a magic school on one of the other levels (see the chances in the random table above), the counter will be reset. Say he is offered a magic school at level 3. If he keeps turning it down and the counter doesn't get reset again, he will be offered it again on levels 7, 11, 15 and 19. Wisdom gets offered in a similar way. If you keep declining it and if the counter doesn't get reset, it will be offered on levels 6, 12 and 18.
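The counter rule just described can be written down as a small sketch. `forced_offer_levels` is a hypothetical name, and the cap of level 20 is an assumption taken only from the highest level the post lists, not from the game itself.

```python
def forced_offer_levels(last_offer_level, interval, max_level=20):
    """Levels at which the skill is offered again if it keeps being declined.

    last_offer_level -- level of the most recent offer (0 if never offered)
    interval         -- 4 for magic schools, 6 for Wisdom, per the post
    max_level        -- assumed cut-off; the post lists nothing past 20
    """
    return list(range(last_offer_level + interval, max_level + 1, interval))


print(forced_offer_levels(0, 4))  # magic school, never offered before
print(forced_offer_levels(3, 4))  # counter reset by a random offer at level 3
print(forced_offer_levels(0, 6))  # Wisdom
```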
Say, you want to know the chance of a certain hero getting earth magic before level 5. Instead of looking at earth magic skill values from the table, you need to look at the ratio of earth/total magic school values to get the chance of getting earth before level 5.

Using this info, here are the % chances to get earth for different might hero classes before they reach level 5:
Knights: 2/10 = 20%
Overlords: 3/6 = 50%
Beastmasters: 3/6 = 50%
Demoniacs: 3/10 = 30%
Death Knights: 4/10 = 40%
Rangers: 3/7 = 43%
Barbarians: 3/8 = 37.5%
Alchemists: 3/10 = 30%
Planeswalkers: 3/10 = 30%
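The percentages above follow directly from the four magic school rows of the might hero table. A minimal sketch that recomputes them; the dictionary name and helper are hypothetical, and the values are copied from the Air, Earth, Fire and Water Magic rows:

```python
# (air, earth, fire, water) values per might hero class, from the table above.
MAGIC_SCHOOLS = {
    "Knight": (3, 2, 1, 4),
    "Overlord": (1, 3, 2, 0),
    "Beastmaster": (1, 3, 0, 2),
    "Demoniac": (2, 3, 4, 1),
    "Death Knight": (2, 4, 1, 3),
    "Ranger": (1, 3, 0, 3),
    "Barbarian": (3, 3, 2, 0),
    "Alchemist": (4, 3, 1, 2),
    "Planeswalker": (2, 3, 3, 2),
}


def earth_chance(hero_class):
    """Share of Earth Magic among the class's four magic school values."""
    air, earth, fire, water = MAGIC_SCHOOLS[hero_class]
    return earth / (air + earth + fire + water)


for name in MAGIC_SCHOOLS:
    print(f"{name}: {earth_chance(name):.1%}")
```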

Edit: edited some points based on posts by maretti and Binabik - thanks for your feedback, guys.

Edit by angelito
Here is some similar data that I found.
When a hero advances a level by means of battling or picking up a treasure chest, you are given the choice of two secondary skill advancements. Believe it or not, you will be offered skill choices (not only new ones, but to advance and expertise in a skill) 112 times before you cannot learn skills any further! These tables indicate how many times a given skill will be offered to that hero type, out of the 112 times.


MAGIC HERO SECONDARY SKILL ADVANCEMENT

SKILL NAME CLER WARL WITC HERE NECR DRUI BATT WIZA
Air Magic 4 2 2 3 3 2 4 5
Archery 3 2 3 4 2 5 4 2
Armorer 3 1 4 4 2 3 4 1
Artillery 2 1 2 4 3 1 4 1
Ballistics 4 6 4 6 5 4 6 4
Diplomacy 7 4 2 3 4 4 3 4
Eagle Eye 6 8 10 4 7 7 5 8
Earth Magic 3 4 4 4 5 4 3 4
Estates 3 5 1 1 3 3 1 5
Fire Magic 2 3 1 5 2 1 2 2
First Aid 10 3 1 5 0 7 4 7
Intelligence 6 8 7 6 7 7 5 10
Leadership 2 3 1 1 0 2 4 4
Learning 4 4 4 4 4 4 4 4
Logistics 4 2 3 3 4 5 10 2
Luck 5 2 4 2 1 10 2 4
Mysticism 4 8 8 10 6 6 4 8
Navigation 5 4 6 2 5 2 0 1
Necromancy 0 4 4 4 10 0 2 0
Offense 4 1 2 4 3 1 7 1
Pathfinding 2 2 2 3 6 5 4 2
Resistance 2 0 0 3 1 1 4 0
Scholar 6 8 7 5 6 8 4 9
Scouting 3 2 2 3 2 2 4 2
Sorcery 5 10 8 6 7 6 6 8
Tactics 2 1 2 4 2 1 5 1
Water Magic 4 1 3 2 4 3 1 3
Wisdom 7 10 8 7 8 8 6 10
Total 112 112 112 112 112 112 112 112

As you can see we have some restrictions on skills for many heroes. Let's summarize this:
Battle Mage - Restriction on Navigation.
Cleric - Cannot learn Necromancy.
Necromancer - Cannot learn First Aid or Leadership.
Druid - Cannot learn Necromancy.
Witch - Restricted from Learning Resistance.
Warlock - Also cannot learn Resistance.
Wizard - Cannot learn Necromancy or Resistance.
We can see that Necromancy is not a favorite among the ranks of magic heroes nor is resistance. Also notice that the Heretic hero class has absolutely no restrictions on any skill.


MIGHT HERO SECONDARY SKILL ADVANCEMENT

SKILL NAME KNIG OVER BEAS DEMO DEAT RANG BARB ALCH
Air Magic 3 1 1 2 2 1 3 4
Archery 5 6 7 6 5 8 7 5
Armorer 5 6 10 7 5 8 6 6
Artillery 5 8 8 5 5 6 8 4
Ballistics 8 7 6 6 7 4 8 6
Diplomacy 4 3 1 4 2 4 1 3
Eagle Eye 2 1 1 2 4 2 2 3
Earth Magic 2 3 3 3 4 3 3 3
Estates 6 4 1 3 0 2 2 4
Fire Magic 1 2 0 4 1 0 2 1
First Aid 2 1 6 1 0 3 1 2
Intelligence 1 1 1 2 5 2 1 4
Leadership 10 8 5 3 0 6 5 3
Learning 4 4 4 4 4 4 4 10
Logistics 5 8 8 10 5 5 7 6
Luck 3 1 2 2 1 6 3 1
Mysticism 2 3 2 3 4 3 3 4
Navigation 8 4 8 3 8 3 2 3
Necromancy 0 1 1 0 10 0 1 4
Offense 7 8 5 5 7 5 10 6
Pathfinding 4 5 8 4 4 7 8 4
Resistance 5 6 5 6 5 10 6 5
Scholar 1 1 1 2 2 1 1 3
Scouting 4 5 7 5 4 7 8 4
Sorcery 1 2 1 3 4 2 1 3
Tactics 7 10 6 6 5 5 7 4
Water Magic 4 0 2 1 3 2 0 2
Wisdom 3 3 2 4 6 3 2 5
Total 112 112 112 112 112 112 112 112
In case you don't remember which class each of these heroes belongs to, here is a list I copied.
Hero classes
Castle
 Knight
 Cleric
Rampart
 Ranger
 Druid
Tower
 Alchemist
 Wizard
Inferno
 Demoniac
 Heretic
Necropolis
 Death Knight
 Necromancer
Dungeon
 Overlord
 Warlock
Stronghold
 Barbarian
 Battle Mage
Fortress
 Beastmaster
 Witch
Conflux
 Planeswalker
 Elementalist
For a while now I have been meaning to collect the documentation on the Linux memory manager, starting with Mel Gorman's book. (Most people have belief because they have vision; very few people have vision because they have belief!)

November 26: Waiting for change, waiting for opportunity

Configuring micro-httpd should have been a routine matter, but for some reason it does not behave on my office computer.
service micro-httpd 
{
        disable         = no
        flags           = REUSE
        id              = micro-httpd
        type            = UNLISTED
        socket_type     = stream
        protocol        = tcp
        user            = nick
        wait            = no
        server          = /usr/sbin/micro-httpd 
        server_args     = /BigDisk/diabloforum/public_html/
        port            = 8080
}
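I made a fool of myself. I had always assumed that Midnight Commander was a pure binary viewer, so after looking at an .h3m map file in it I wanted to read the file directly. But the file command kept reporting it as a gzip file, which forced me to suspect that mc understands the gzip format. And indeed it does: hexdump shows that the file header really is the gzip magic number. Note that the first two bytes on disk are 1f 8b; hexdump's default output groups bytes into 16-bit integers, so on a little-endian machine they are displayed as 8b1f.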
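The check that file performs here, and the reason hexdump shows 8b1f, can both be reproduced in a short sketch. The helper names `looks_like_gzip` and `as_le16` are hypothetical, and the sample is produced with `gzip.compress` rather than read from a real .h3m map.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(data: bytes) -> bool:
    """True if the data starts with the gzip magic bytes."""
    return data[:2] == GZIP_MAGIC


def as_le16(data: bytes) -> int:
    """Read the first two bytes as one little-endian 16-bit integer."""
    return struct.unpack("<H", data[:2])[0]


sample = gzip.compress(b"stand-in for map data")
print(looks_like_gzip(sample))           # True
print(hex(as_le16(sample)))              # 0x8b1f, the order hexdump displays
print(hex(struct.unpack(">H", sample[:2])[0]))  # 0x1f8b, the order on disk
```

Reading the same two bytes with `<H` and `>H` is also a compact reminder of how easily a little-endian format gets misread with a big-endian reader.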

November 28: Waiting for change, waiting for opportunity

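Yesterday I finally found that there was nothing wrong with the way I had configured micro-httpd; the culprit may be the company firewall. Similar problems show up with web services on some older systems that cannot get through the firewall. Perhaps. All of this actually started because my micro-httpd could not serve PDF and similar files correctly. At first I blamed Firefox's PDF plugin, but I soon realized it was a matter of micro-httpd's MIME settings. A look at the source code confirmed it: a web server that advertises itself as a hundred lines of code is naturally not going to ship a complete MIME list. So how about adding the MIME types myself? I made a list: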
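#include <string>

// Maps a file extension to a MIME type. Some extensions appear more than once;
// a first-match lookup will only ever return the first entry for an extension.
struct MimeExt { std::string strExt; std::string strMime; };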
struct MimeExt mimeExt[]={
{".3dm","x-world/x-3dmf"},
{".3dmf","x-world/x-3dmf"},
{".a","application/octet-stream"},
{".aab","application/x-authorware-bin"},
{".aam","application/x-authorware-map"},
{".aas","application/x-authorware-seg"},
{".abc","text/vnd.abc"},
{".acgi","text/html"},
{".afl","video/animaflex"},
{".ai","application/postscript"},
{".aif","audio/aiff"},
{".aif","audio/x-aiff"},
{".aifc","audio/aiff"},
{".aifc","audio/x-aiff"},
{".aiff","audio/aiff"},
{".aiff","audio/x-aiff"},
{".aim","application/x-aim"},
{".aip","text/x-audiosoft-intra"},
{".ani","application/x-navi-animation"},
{".aos","application/x-nokia-9000-communicator-add-on-software"},
{".aps","application/mime"},
{".arc","application/octet-stream"},
{".arj","application/arj"},
{".arj","application/octet-stream"},
{".art","image/x-jg"},
{".asf","video/x-ms-asf"},
{".asm","text/x-asm"},
{".asp","text/asp"},
{".asx","application/x-mplayer2"},
{".asx","video/x-ms-asf"},
{".asx","video/x-ms-asf-plugin"},
{".au","audio/basic"},
{".au","audio/x-au"},
{".avi","application/x-troff-msvideo"},
{".avi","video/avi"},
{".avi","video/msvideo"},
{".avi","video/x-msvideo"},
{".avs","video/avs-video"},
{".bcpio","application/x-bcpio"},
{".bin","application/mac-binary"},
{".bin","application/macbinary"},
{".bin","application/octet-stream"},
{".bin","application/x-binary"},
{".bin","application/x-macbinary"},
{".bm","image/bmp"},
{".bmp","image/bmp"},
{".bmp","image/x-windows-bmp"},
{".boo","application/book"},
{".book","application/book"},
{".boz","application/x-bzip2"},
{".bsh","application/x-bsh"},
{".bz","application/x-bzip"},
{".bz2","application/x-bzip2"},
{".c","text/plain"},
{".c","text/x-c"},
{".c++","text/plain"},
{".cat","application/vnd.ms-pki.seccat"},
{".cc","text/plain"},
{".cc","text/x-c"},
{".ccad","application/clariscad"},
{".cco","application/x-cocoa"},
{".cdf","application/cdf"},
{".cdf","application/x-cdf"},
{".cdf","application/x-netcdf"},
{".cer","application/pkix-cert"},
{".cer","application/x-x509-ca-cert"},
{".cha","application/x-chat"},
{".chat","application/x-chat"},
{".class","application/java"},
{".class","application/java-byte-code"},
{".class","application/x-java-class"},
{".com","application/octet-stream"},
{".com","text/plain"},
{".conf","text/plain"},
{".cpio","application/x-cpio"},
{".cpp","text/x-c"},
{".cpt","application/mac-compactpro"},
{".cpt","application/x-compactpro"},
{".cpt","application/x-cpt"},
{".crl","application/pkcs-crl"},
{".crl","application/pkix-crl"},
{".crt","application/pkix-cert"},
{".crt","application/x-x509-ca-cert"},
{".crt","application/x-x509-user-cert"},
{".csh","application/x-csh"},
{".csh","text/x-script.csh"},
{".css","application/x-pointplus"},
{".css","text/css"},
{".cxx","text/plain"},
{".dcr","application/x-director"},
{".deepv","application/x-deepv"},
{".def","text/plain"},
{".der","application/x-x509-ca-cert"},
{".dif","video/x-dv"},
{".dir","application/x-director"},
{".dl","video/dl"},
{".dl","video/x-dl"},
{".doc","application/msword"},
{".dot","application/msword"},
{".dp","application/commonground"},
{".drw","application/drafting"},
{".dump","application/octet-stream"},
{".dv","video/x-dv"},
{".dvi","application/x-dvi"},
{".dwf","drawing/x-dwf"},
{".dwf","model/vnd.dwf"},
{".dwg","application/acad"},
{".dwg","image/vnd.dwg"},
{".dwg","image/x-dwg"},
{".dxf","application/dxf"},
{".dxf","image/vnd.dwg"},
{".dxf","image/x-dwg"},
{".dxr","application/x-director"},
{".el","text/x-script.elisp"},
{".elc","application/x-bytecode.elisp"},
{".elc","application/x-elc"},
{".env","application/x-envoy"},
{".eps","application/postscript"},
{".es","application/x-esrehber"},
{".etx","text/x-setext"},
{".evy","application/envoy"},
{".evy","application/x-envoy"},
{".exe","application/octet-stream"},
{".f","text/plain"},
{".f","text/x-fortran"},
{".f77","text/x-fortran"},
{".f90","text/plain"},
{".f90","text/x-fortran"},
{".fdf","application/vnd.fdf"},
{".fif","application/fractals"},
{".fif","image/fif"},
{".fli","video/fli"},
{".fli","video/x-fli"},
{".flo","image/florian"},
{".flx","text/vnd.fmi.flexstor"},
{".fmf","video/x-atomic3d-feature"},
{".for","text/plain"},
{".for","text/x-fortran"},
{".fpx","image/vnd.fpx"},
{".fpx","image/vnd.net-fpx"},
{".frl","application/freeloader"},
{".funk","audio/make"},
{".g","text/plain"},
{".g3","image/g3fax"},
{".gif","image/gif"},
{".gl","video/gl"},
{".gl","video/x-gl"},
{".gsd","audio/x-gsm"},
{".gsm","audio/x-gsm"},
{".gsp","application/x-gsp"},
{".gss","application/x-gss"},
{".gtar","application/x-gtar"},
{".gz","application/x-compressed"},
{".gz","application/x-gzip"},
{".gzip","application/x-gzip"},
{".gzip","multipart/x-gzip"},
{".h","text/plain"},
{".h","text/x-h"},
{".hdf","application/x-hdf"},
{".help","application/x-helpfile"},
{".hgl","application/vnd.hp-hpgl"},
{".hh","text/plain"},
{".hh","text/x-h"},
{".hlb","text/x-script"},
{".hlp","application/hlp"},
{".hlp","application/x-helpfile"},
{".hlp","application/x-winhelp"},
{".hpg","application/vnd.hp-hpgl"},
{".hpgl","application/vnd.hp-hpgl"},
{".hqx","application/binhex"},
{".hqx","application/binhex4"},
{".hqx","application/mac-binhex"},
{".hqx","application/mac-binhex40"},
{".hqx","application/x-binhex40"},
{".hqx","application/x-mac-binhex40"},
{".hta","application/hta"},
{".htc","text/x-component"},
{".htm","text/html"},
{".html","text/html"},
{".htmls","text/html"},
{".htt","text/webviewhtml"},
{".htx","text/html"},
{".ice","x-conference/x-cooltalk"},
{".ico","image/x-icon"},
{".idc","text/plain"},
{".ief","image/ief"},
{".iefs","image/ief"},
{".iges","application/iges"},
{".iges","model/iges"},
{".igs","application/iges"},
{".igs","model/iges"},
{".ima","application/x-ima"},
{".imap","application/x-httpd-imap"},
{".inf","application/inf"},
{".ins","application/x-internett-signup"},
{".ip","application/x-ip2"},
{".isu","video/x-isvideo"},
{".it","audio/it"},
{".iv","application/x-inventor"},
{".ivr","i-world/i-vrml"},
{".ivy","application/x-livescreen"},
{".jam","audio/x-jam"},
{".jav","text/plain"},
{".jav","text/x-java-source"},
{".java","text/plain"},
{".java","text/x-java-source"},
{".jcm","application/x-java-commerce"},
{".jfif","image/jpeg"},
{".jfif","image/pjpeg"},
{".jfif-tbnl","image/jpeg"},
{".jpe","image/jpeg"},
{".jpe","image/pjpeg"},
{".jpeg","image/jpeg"},
{".jpeg","image/pjpeg"},
{".jpg","image/jpeg"},
{".jpg","image/pjpeg"},
{".jps","image/x-jps"},
{".js","application/x-javascript"},
{".js","application/javascript"},
{".js","application/ecmascript"},
{".js","text/javascript"},
{".js","text/ecmascript"},
{".jut","image/jutvision"},
{".kar","audio/midi"},
{".kar","music/x-karaoke"},
{".ksh","application/x-ksh"},
{".ksh","text/x-script.ksh"},
{".la","audio/nspaudio"},
{".la","audio/x-nspaudio"},
{".lam","audio/x-liveaudio"},
{".latex","application/x-latex"},
{".lha","application/lha"},
{".lha","application/octet-stream"},
{".lha","application/x-lha"},
{".lhx","application/octet-stream"},
{".list","text/plain"},
{".lma","audio/nspaudio"},
{".lma","audio/x-nspaudio"},
{".log","text/plain"},
{".lsp","application/x-lisp"},
{".lsp","text/x-script.lisp"},
{".lst","text/plain"},
{".lsx","text/x-la-asf"},
{".ltx","application/x-latex"},
{".lzh","application/octet-stream"},
{".lzh","application/x-lzh"},
{".lzx","application/lzx"},
{".lzx","application/octet-stream"},
{".lzx","application/x-lzx"},
{".m","text/plain"},
{".m","text/x-m"},
{".m1v","video/mpeg"},
{".m2a","audio/mpeg"},
{".m2v","video/mpeg"},
{".m3u","audio/x-mpegurl"},
{".man","application/x-troff-man"},
{".map","application/x-navimap"},
{".mar","text/plain"},
{".mbd","application/mbedlet"},
{".mc$","application/x-magic-cap-package-1.0"},
{".mcd","application/mcad"},
{".mcd","application/x-mathcad"},
{".mcf","image/vasa"},
{".mcf","text/mcf"},
{".mcp","application/netmc"},
{".me","application/x-troff-me"},
{".mht","message/rfc822"},
{".mhtml","message/rfc822"},
{".mid","application/x-midi"},
{".mid","audio/midi"},
{".mid","audio/x-mid"},
{".mid","audio/x-midi"},
{".mid","music/crescendo"},
{".mid","x-music/x-midi"},
{".midi","application/x-midi"},
{".midi","audio/midi"},
{".midi","audio/x-mid"},
{".midi","audio/x-midi"},
{".midi","music/crescendo"},
{".midi","x-music/x-midi"},
{".mif","application/x-frame"},
{".mif","application/x-mif"},
{".mime","message/rfc822"},
{".mime","www/mime"},
{".mjf","audio/x-vnd.audioexplosion.mjuicemediafile"},
{".mjpg","video/x-motion-jpeg"},
{".mm","application/base64"},
{".mm","application/x-meme"},
{".mme","application/base64"},
{".mod","audio/mod"},
{".mod","audio/x-mod"},
{".moov","video/quicktime"},
{".mov","video/quicktime"},
{".movie","video/x-sgi-movie"},
{".mp2","audio/mpeg"},
{".mp2","audio/x-mpeg"},
{".mp2","video/mpeg"},
{".mp2","video/x-mpeg"},
{".mp2","video/x-mpeq2a"},
{".mp3","audio/mpeg3"},
{".mp3","audio/x-mpeg-3"},
{".mp3","video/mpeg"},
{".mp3","video/x-mpeg"},
{".mpa","audio/mpeg"},
{".mpa","video/mpeg"},
{".mpc","application/x-project"},
{".mpe","video/mpeg"},
{".mpeg","video/mpeg"},
{".mpg","audio/mpeg"},
{".mpg","video/mpeg"},
{".mpga","audio/mpeg"},
{".mpp","application/vnd.ms-project"},
{".mpt","application/x-project"},
{".mpv","application/x-project"},
{".mpx","application/x-project"},
{".mrc","application/marc"},
{".ms","application/x-troff-ms"},
{".mv","video/x-sgi-movie"},
{".my","audio/make"},
{".mzz","application/x-vnd.audioexplosion.mzz"},
{".nap","image/naplps"},
{".naplps","image/naplps"},
{".nc","application/x-netcdf"},
{".ncm","application/vnd.nokia.configuration-message"},
{".nif","image/x-niff"},
{".niff","image/x-niff"},
{".nix","application/x-mix-transfer"},
{".nsc","application/x-conference"},
{".nvd","application/x-navidoc"},
{".o","application/octet-stream"},
{".oda","application/oda"},
{".omc","application/x-omc"},
{".omcd","application/x-omcdatamaker"},
{".omcr","application/x-omcregerator"},
{".p","text/x-pascal"},
{".p10","application/pkcs10"},
{".p10","application/x-pkcs10"},
{".p12","application/pkcs-12"},
{".p12","application/x-pkcs12"},
{".p7a","application/x-pkcs7-signature"},
{".p7c","application/pkcs7-mime"},
{".p7c","application/x-pkcs7-mime"},
{".p7m","application/pkcs7-mime"},
{".p7m","application/x-pkcs7-mime"},
{".p7r","application/x-pkcs7-certreqresp"},
{".p7s","application/pkcs7-signature"},
{".part","application/pro_eng"},
{".pas","text/pascal"},
{".pbm","image/x-portable-bitmap"},
{".pcl","application/vnd.hp-pcl"},
{".pcl","application/x-pcl"},
{".pct","image/x-pict"},
{".pcx","image/x-pcx"},
{".pdb","chemical/x-pdb"},
{".pdf","application/pdf"},
{".pfunk","audio/make"},
{".pfunk","audio/make.my.funk"},
{".pgm","image/x-portable-graymap"},
{".pgm","image/x-portable-greymap"},
{".pic","image/pict"},
{".pict","image/pict"},
{".pkg","application/x-newton-compatible-pkg"},
{".pko","application/vnd.ms-pki.pko"},
{".pl","text/plain"},
{".pl","text/x-script.perl"},
{".plx","application/x-pixclscript"},
{".pm","image/x-xpixmap"},
{".pm","text/x-script.perl-module"},
{".pm4","application/x-pagemaker"},
{".pm5","application/x-pagemaker"},
{".png","image/png"},
{".pnm","application/x-portable-anymap"},
{".pnm","image/x-portable-anymap"},
{".pot","application/mspowerpoint"},
{".pot","application/vnd.ms-powerpoint"},
{".pov","model/x-pov"},
{".ppa","application/vnd.ms-powerpoint"},
{".ppm","image/x-portable-pixmap"},
{".pps","application/mspowerpoint"},
{".pps","application/vnd.ms-powerpoint"},
{".ppt","application/mspowerpoint"},
{".ppt","application/powerpoint"},
{".ppt","application/vnd.ms-powerpoint"},
{".ppt","application/x-mspowerpoint"},
{".ppz","application/mspowerpoint"},
{".pre","application/x-freelance"},
{".prt","application/pro_eng"},
{".ps","application/postscript"},
{".psd","application/octet-stream"},
{".pvu","paleovu/x-pv"},
{".pwz","application/vnd.ms-powerpoint"},
{".py","text/x-script.python"},
{".pyc","application/x-bytecode.python"},
{".qcp","audio/vnd.qcelp"},
{".qd3","x-world/x-3dmf"},
{".qd3d","x-world/x-3dmf"},
{".qif","image/x-quicktime"},
{".qt","video/quicktime"},
{".qtc","video/x-qtc"},
{".qti","image/x-quicktime"},
{".qtif","image/x-quicktime"},
{".ra","audio/x-pn-realaudio"},
{".ra","audio/x-pn-realaudio-plugin"},
{".ra","audio/x-realaudio"},
{".ram","audio/x-pn-realaudio"},
{".ras","application/x-cmu-raster"},
{".ras","image/cmu-raster"},
{".ras","image/x-cmu-raster"},
{".rast","image/cmu-raster"},
{".rexx","text/x-script.rexx"},
{".rf","image/vnd.rn-realflash"},
{".rgb","image/x-rgb"},
{".rm","application/vnd.rn-realmedia"},
{".rm","audio/x-pn-realaudio"},
{".rmi","audio/mid"},
{".rmm","audio/x-pn-realaudio"},
{".rmp","audio/x-pn-realaudio"},
{".rmp","audio/x-pn-realaudio-plugin"},
{".rng","application/ringing-tones"},
{".rng","application/vnd.nokia.ringing-tone"},
{".rnx","application/vnd.rn-realplayer"},
{".roff","application/x-troff"},
{".rp","image/vnd.rn-realpix"},
{".rpm","audio/x-pn-realaudio-plugin"},
{".rt","text/richtext"},
{".rt","text/vnd.rn-realtext"},
{".rtf","application/rtf"},
{".rtf","application/x-rtf"},
{".rtf","text/richtext"},
{".rtx","application/rtf"},
{".rtx","text/richtext"},
{".rv","video/vnd.rn-realvideo"},
{".s","text/x-asm"},
{".s3m","audio/s3m"},
{".saveme","application/octet-stream"},
{".sbk","application/x-tbook"},
{".scm","application/x-lotusscreencam"},
{".scm","text/x-script.guile"},
{".scm","text/x-script.scheme"},
{".scm","video/x-scm"},
{".sdml","text/plain"},
{".sdp","application/sdp"},
{".sdp","application/x-sdp"},
{".sdr","application/sounder"},
{".sea","application/sea"},
{".sea","application/x-sea"},
{".set","application/set"},
{".sgm","text/sgml"},
{".sgm","text/x-sgml"},
{".sgml","text/sgml"},
{".sgml","text/x-sgml"},
{".sh","application/x-bsh"},
{".sh","application/x-sh"},
{".sh","application/x-shar"},
{".sh","text/x-script.sh"},
{".shar","application/x-bsh"},
{".shar","application/x-shar"},
{".shtml","text/html"},
{".shtml","text/x-server-parsed-html"},
{".sid","audio/x-psid"},
{".sit","application/x-sit"},
{".sit","application/x-stuffit"},
{".skd","application/x-koan"},
{".skm","application/x-koan"},
{".skp","application/x-koan"},
{".skt","application/x-koan"},
{".sl","application/x-seelogo"},
{".smi","application/smil"},
{".smil","application/smil"},
{".snd","audio/basic"},
{".snd","audio/x-adpcm"},
{".sol","application/solids"},
{".spc","application/x-pkcs7-certificates"},
{".spc","text/x-speech"},
{".spl","application/futuresplash"},
{".spr","application/x-sprite"},
{".sprite","application/x-sprite"},
{".src","application/x-wais-source"},
{".ssi","text/x-server-parsed-html"},
{".ssm","application/streamingmedia"},
{".sst","application/vnd.ms-pki.certstore"},
{".step","application/step"},
{".stl","application/sla"},
{".stl","application/vnd.ms-pki.stl"},
{".stl","application/x-navistyle"},
{".stp","application/step"},
{".sv4cpio","application/x-sv4cpio"},
{".sv4crc","application/x-sv4crc"},
{".svf","image/vnd.dwg"},
{".svf","image/x-dwg"},
{".svr","application/x-world"},
{".svr","x-world/x-svr"},
{".swf","application/x-shockwave-flash"},
{".t","application/x-troff"},
{".talk","text/x-speech"},
{".tar","application/x-tar"},
{".tbk","application/toolbook"},
{".tbk","application/x-tbook"},
{".tcl","application/x-tcl"},
{".tcl","text/x-script.tcl"},
{".tcsh","text/x-script.tcsh"},
{".tex","application/x-tex"},
{".texi","application/x-texinfo"},
{".texinfo","application/x-texinfo"},
{".text","application/plain"},
{".text","text/plain"},
{".tgz","application/gnutar"},
{".tgz","application/x-compressed"},
{".tif","image/tiff"},
{".tif","image/x-tiff"},
{".tiff","image/tiff"},
{".tiff","image/x-tiff"},
{".tr","application/x-troff"},
{".tsi","audio/tsp-audio"},
{".tsp","application/dsptype"},
{".tsp","audio/tsplayer"},
{".tsv","text/tab-separated-values"},
{".turbot","image/florian"},
{".txt","text/plain"},
{".uil","text/x-uil"},
{".uni","text/uri-list"},
{".unis","text/uri-list"},
{".unv","application/i-deas"},
{".uri","text/uri-list"},
{".uris","text/uri-list"},
{".ustar","application/x-ustar"},
{".ustar","multipart/x-ustar"},
{".uu","application/octet-stream"},
{".uu","text/x-uuencode"},
{".uue","text/x-uuencode"},
{".vcd","application/x-cdlink"},
{".vcs","text/x-vcalendar"},
{".vda","application/vda"},
{".vdo","video/vdo"},
{".vew","application/groupwise"},
{".viv","video/vivo"},
{".viv","video/vnd.vivo"},
{".vivo","video/vivo"},
{".vivo","video/vnd.vivo"},
{".vmd","application/vocaltec-media-desc"},
{".vmf","application/vocaltec-media-file"},
{".voc","audio/voc"},
{".voc","audio/x-voc"},
{".vos","video/vosaic"},
{".vox","audio/voxware"},
{".vqe","audio/x-twinvq-plugin"},
{".vqf","audio/x-twinvq"},
{".vql","audio/x-twinvq-plugin"},
{".vrml","application/x-vrml"},
{".vrml","model/vrml"},
{".vrml","x-world/x-vrml"},
{".vrt","x-world/x-vrt"},
{".vsd","application/x-visio"},
{".vst","application/x-visio"},
{".vsw","application/x-visio"},
{".w60","application/wordperfect6.0"},
{".w61","application/wordperfect6.1"},
{".w6w","application/msword"},
{".wav","audio/wav"},
{".wav","audio/x-wav"},
{".wb1","application/x-qpro"},
{".wbmp","image/vnd.wap.wbmp"},
{".web","application/vnd.xara"},
{".wiz","application/msword"},
{".wk1","application/x-123"},
{".wmf","windows/metafile"},
{".wml","text/vnd.wap.wml"},
{".wmlc","application/vnd.wap.wmlc"},
{".wmls","text/vnd.wap.wmlscript"},
{".wmlsc","application/vnd.wap.wmlscriptc"},
{".word","application/msword"},
{".wp","application/wordperfect"},
{".wp5","application/wordperfect"},
{".wp5","application/wordperfect6.0"},
{".wp6","application/wordperfect"},
{".wpd","application/wordperfect"},
{".wpd","application/x-wpwin"},
{".wq1","application/x-lotus"},
{".wri","application/mswrite"},
{".wri","application/x-wri"},
{".wrl","application/x-world"},
{".wrl","model/vrml"},
{".wrl","x-world/x-vrml"},
{".wrz","model/vrml"},
{".wrz","x-world/x-vrml"},
{".wsc","text/scriplet"},
{".wsrc","application/x-wais-source"},
{".wtk","application/x-wintalk"},
{".xbm","image/x-xbitmap"},
{".xbm","image/x-xbm"},
{".xbm","image/xbm"},
{".xdr","video/x-amt-demorun"},
{".xgz","xgl/drawing"},
{".xif","image/vnd.xiff"},
{".xl","application/excel"},
{".xla","application/excel"},
{".xla","application/x-excel"},
{".xla","application/x-msexcel"},
{".xlb","application/excel"},
{".xlb","application/vnd.ms-excel"},
{".xlb","application/x-excel"},
{".xlc","application/excel"},
{".xlc","application/vnd.ms-excel"},
{".xlc","application/x-excel"},
{".xld","application/excel"},
{".xld","application/x-excel"},
{".xlk","application/excel"},
{".xlk","application/x-excel"},
{".xll","application/excel"},
{".xll","application/vnd.ms-excel"},
{".xll","application/x-excel"},
{".xlm","application/excel"},
{".xlm","application/vnd.ms-excel"},
{".xlm","application/x-excel"},
{".xls","application/excel"},
{".xls","application/vnd.ms-excel"},
{".xls","application/x-excel"},
{".xls","application/x-msexcel"},
{".xlt","application/excel"},
{".xlt","application/x-excel"},
{".xlv","application/excel"},
{".xlv","application/x-excel"},
{".xlw","application/excel"},
{".xlw","application/vnd.ms-excel"},
{".xlw","application/x-excel"},
{".xlw","application/x-msexcel"},
{".xm","audio/xm"},
{".xml","application/xml"},
{".xml","text/xml"},
{".xmz","xgl/movie"},
{".xpix","application/x-vnd.ls-xpix"},
{".xpm","image/x-xpixmap"},
{".xpm","image/xpm"},
{".x-png","image/png"},
{".xsr","video/x-amt-showrun"},
{".xwd","image/x-xwd"},
{".xwd","image/x-xwindowdump"},
{".xyz","chemical/x-pdb"},
{".z","application/x-compress"},
{".z","application/x-compressed"},
{".zip","application/x-compressed"},
{".zip","application/x-zip-compressed"},
{".zip","application/zip"},
{".zip","multipart/x-zip"},
{".zoo","application/octet-stream"},
{".zsh","text/x-script.zsh"},
};
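How a server might consult a table like this is sketched below in Python, purely as an illustration and not as micro-httpd's actual code. The table is a small excerpt of the list above, `build_index` and `mime_for` are hypothetical names, and keeping only the first entry per extension is an assumed policy for the duplicated extensions.

```python
import os

# A short excerpt of the extension/MIME pairs listed above.
MIME_TABLE = [
    (".htm", "text/html"),
    (".html", "text/html"),
    (".jpg", "image/jpeg"),
    (".jpg", "image/pjpeg"),
    (".pdf", "application/pdf"),
    (".txt", "text/plain"),
]


def build_index(table):
    """Build an extension -> MIME dict; the first entry for an extension wins."""
    index = {}
    for ext, mime in table:
        index.setdefault(ext.lower(), mime)
    return index


def mime_for(path, index, default="application/octet-stream"):
    """Look up the MIME type by file extension, case-insensitively."""
    ext = os.path.splitext(path)[1].lower()
    return index.get(ext, default)


index = build_index(MIME_TABLE)
print(mime_for("docs/manual.PDF", index))
print(mime_for("photo.jpg", index))
print(mime_for("README", index))
```

Falling back to application/octet-stream for unknown extensions makes the browser download the file instead of guessing how to render it.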
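But after foolishly going in to work on Thanksgiving Day, I changed my mind: my earlier idea of using file/magic still seems the better one.
Another big joke: the night before last I was so muddled that I went the whole evening without realizing I was reading in big-endian mode. When I copied and pasted the code for reading HoMM3 maps, I accidentally picked the big-endian variant, the one that reverses every integer, so all evening I could not understand why every value I read was wrong. Tangled up in this were my doubts about how to use stringstream: I did not understand why the stream position advanced correctly with the read method, while with the get method it seemed to advance correctly the first time and go wrong afterwards.
Today I started looking again at how to convert Chinese strings. The first question is which encoding the Chinese text actually uses, and this is exactly where basic concepts are put to the test. The devil is in the details. Hardly any programmer would admit to not understanding Unicode, but is that really so? I have only a hazy grasp of UTF-8 and UTF-16, let alone the various Chinese encodings. What exactly is the relationship between GB2312 and UTF-8? To this day I have no clear picture, which is really disheartening to admit. Here I first excerpt the part about UTF-8 and UTF-16. One very important advantage of UTF-8 is that it is endian-neutral: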
because Unicode text encoded in UTF-8 is just a sequence of 8-bit byte units, there's no endianness complication. The UTF-8 encoding (unlike UTF-16) is endian-neutral by design. This is an important feature when exchanging text across different computing systems that can have different hardware architectures with different endianness.
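Microsoft mentions UTF-16 here only because it is the basic form used by the Microsoft APIs. That usage is no longer mainstream, so there is no need to adopt it. Why keep two rivals when one will do? This one huge advantage of UTF-8 is enough.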
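The endianness point in the quote, and the relationship between GB2312 and UTF-8 asked about above, can be seen by encoding a single character several ways. This is only an illustrative sketch using Python's built-in codecs; the character 星 is chosen arbitrarily.

```python
text = "星"  # U+661F

utf8 = text.encode("utf-8")
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")
gb2312 = text.encode("gb2312")

# UTF-8 is a fixed byte sequence: there is no byte-order variant to choose.
print("utf-8    ", utf8.hex())
# UTF-16 stores one 16-bit unit here, so the two byte orders differ.
print("utf-16-le", utf16_le.hex())
print("utf-16-be", utf16_be.hex())
# GB2312 is a separate national character set with its own two-byte codes;
# it is not another serialization of the Unicode code point.
print("gb2312   ", gb2312.hex())

# Converting between them always goes through decoding to Unicode text.
assert gb2312.decode("gb2312").encode("utf-8") == utf8
```

In other words, GB2312 and UTF-8 are two unrelated byte representations of the same character, and a file in one must be decoded with the right codec before it can be re-encoded in the other.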

December 1: Waiting for change, waiting for opportunity

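Today I pulled off a difficult maneuver, with so much to tell that I do not know where to begin. I will record it here, starting with one small part: make not showing the commands it executes. This post puts it very well: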
By default, make does print every command before executing it. This printing can be suppressed by one of the following mechanisms:
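In file's Makefile I saw that AM_DEFAULT_VERBOSITY=0 was what suppressed the command output, so I changed it to 1 and that fixed it. At one point my edits to Makefile.am caused configure to trigger a regeneration through libtool, which complained that my version did not exist. The cause, of course, is that the file package was produced with a fairly old version of the autotools. To suppress the problem I ran touch -d "2 days ago" Makefile.am.
I am going through all of this because I want to make one very small improvement: have libmagic embed the default magic.mgc into the binary as a resource, so that it cannot fail to be found just because different systems use different paths. This is a thoroughly anti-intellectual thing to do, since the whole point of using a file is the flexibility it adds, and here I am making this unreasonable and ill-mannered foolish move. Yet the process turned out to be far more complicated than I expected. First I have to modify Makefile.am/Makefile.in to add the assembly file. In fact I cheated and dropped the *.Plo file straight into the .deps directory, because I did not know how it gets generated. It now looks as if I only need to add my assembly file blob.S to the sources and re-run configure to generate it.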

December 2: Waiting for change, waiting for opportunity

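Having finally found the source of the problem, I grandly named my discovery process "A Study in a Variable's Address", as one more tribute to the great Sherlock Holmes's method: once you have eliminated everything impossible, the one thing you thought impossible becomes possible. I checked the build process carefully and it looked fine. Having ruled out a build error, I began to suspect the code, but looking at the symbols in the disassembly gave me no clue either. My ability to read assembly is too poor; I need to study it more.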
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -DMAGIC=\"/etc/magic:/usr/local/share/misc/magic\" -fvisibility=hidden -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wmissing-declarations -Wredundant-decls -Wnested-externs -Wsign-compare -Wreturn-type -Wswitch -Wshadow -Wcast-qual -Wwrite-strings -Wextra -Wunused-parameter -Wformat=2 -g -O0 -MT blob.lo -MD -MP -MF .deps/blob.Tpo -c blob.S -o blob.o
libtool: link: ar cru .libs/libmagic.a  magic.o apprentice.o blob.o softmagic.o ascmagic.o encoding.o compress.o is_tar.o readelf.o print.o fsmagic.o funcs.o apptype.o der.o cdf.o cdf_time.o readcdf.o strlcpy.o strlcat.o fmtcheck.o
libtool: link: ranlib .libs/libmagic.a
libtool: link: gcc -fvisibility=hidden -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wmissing-declarations -Wredundant-decls -Wnested-externs -Wsign-compare -Wreturn-type -Wswitch -Wshadow -Wcast-qual -Wwrite-strings -Wextra -Wunused-parameter -Wformat=2 -g -O0 -g -O0 -o file file.o  ./.libs/libmagic.a -lz
The disassembly looks like this. What I could not figure out at all was how the offset 00000000000273a8 in .rodata relates to the address 0x1f9e0 in the code.

00000000000273a8 g       .rodata        0000000000000000              in_memory_blob
00000000000273a8 g       .rodata        0000000000000000              in_memory_blob

extern char* in_memory_blob;
extern int in_memory_blob_size;

private int mygetline(char **buf, size_t *bufsiz, size_t*pos)
{
    7982:       55                      push   %rbp
    7983:       48 89 e5                mov    %rsp,%rbp
    7986:       48 83 ec 30             sub    $0x30,%rsp
    798a:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
    798e:       48 89 75 e0             mov    %rsi,-0x20(%rbp)
    7992:       48 89 55 d8             mov    %rdx,-0x28(%rbp)
        if (*pos >= in_memory_blob_size)
    7996:       48 8b 45 d8             mov    -0x28(%rbp),%rax
    799a:       48 8b 10                mov    (%rax),%rdx
    799d:       8b 05 45 b0 4d 00       mov    0x4db045(%rip),%eax        # 4e29e8 <in_memory_blob_size>
    79a3:       48 98                   cltq   
    79a5:       48 39 c2                cmp    %rax,%rdx
    79a8:       72 0a                   jb     79b4 <mygetline+0x32>
        {
                return -1;
    79aa:       b8 ff ff ff ff          mov    $0xffffffff,%eax
    79af:       e9 09 01 00 00          jmpq   7abd <mygetline+0x13b>
        }
        size_t newsize=0;
    79b4:       48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
    79bb:       00 
        while (*pos + newsize < in_memory_blob_size)
    79bc:       e9 d9 00 00 00          jmpq   7a9a <mygetline+0x118>
        {
                if (in_memory_blob[*pos + newsize++] == '\n')
    79c1:       48 8b 0d e0 f9 01 00    mov    0x1f9e0(%rip),%rcx        # 273a8 <in_memory_blob>
    79c8:       48 8b 45 d8             mov    -0x28(%rbp),%rax
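After a whole morning of blind fiddling I finally understood.
  1. The compiler is of course not wrong, and there is no issue of absolute versus relative section addresses. For example, in mov 0x1f9e0(%rip),%rcx # 273a8 <in_memory_blob> the address is perfectly fine: 0x1f9e0 plus the current instruction pointer, that is, the address of the next instruction, 0x79c8, gives the .rodata offset 0x273a8.
  2. In gdb I saw the real problem. I had declared the global variable in_memory_blob as a pointer, char*, yet I could only reach the data by taking its address; in other words, only with & did I actually get the offset 0x273a8 that it stands for. (That number is of course not the real one, because in gdb everything has a real base address added. But what I say is true: comparing with the other variable, in_memory_blob_size, which I could always access without trouble, shows which value is the actual address.) So the conclusion is that my understanding of pointer types at the assembly level was wrong!
  3. This is really embarrassing. After so many years of using pointers, how could I make such an elementary mistake? Thinking back to the only university assignment for which I wrote assembly, when you declare a string you can indeed only say that its type is char; there is no such concept as char*. That is, the global variable in_memory_blob that I declared must have type char, and I then have to use its address to get at the array: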
    
        extern char in_memory_blob;
        ...
        char* ptr = &in_memory_blob;
        if (ptr[*pos + newsize++] == '\n')
    
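    I really am getting senile! My test example does not seem to have this problem at all. Have all my years of writing code been for nothing? The difference is that the test declares blob as an array, extern char blob[], which names the bytes themselves, whereas extern char* tells the compiler to load a pointer value stored at that address.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    extern char blob[];     /* the symbol names the bytes themselves */
    extern int blob_size;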
    int main(int argc, char** argv)
    {
    	char* buf = NULL;
    	buf = (char*)malloc(blob_size+1);
    	if (buf)
    	{
    		memcpy(buf, blob, blob_size);
    		buf[blob_size] = '\0';
    		printf("%s\n", buf);
    		free(buf);
    	}
    	return 0;
    }
    
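  4. Only with this problem solved did I run into the real core issue: my algorithm is wrong. The trouble with this code is that the input file seems to be the wrong one; that is, does it expect the unprocessed file? This is the real heart of the matter, and I spent a full day and a half on side details. Now that I have reached the core algorithmic problem, I need to go back and read the code.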
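The address arithmetic from point 1 can be verified mechanically from the disassembly. A minimal sketch, with `rip_relative_target` as a hypothetical helper name; the addresses, instruction bytes and displacements are copied from the listing above.

```python
def rip_relative_target(insn_addr, insn_len, disp):
    """Target of a RIP-relative operand.

    %rip holds the address of the NEXT instruction while the current one
    executes, so the target is (insn_addr + insn_len) + disp.
    """
    return insn_addr + insn_len + disp


# 79c1: 48 8b 0d e0 f9 01 00   mov 0x1f9e0(%rip),%rcx   # 273a8 <in_memory_blob>
blob_insn = bytes.fromhex("488b0de0f90100")
blob_disp = int.from_bytes(blob_insn[3:], "little", signed=True)
print(hex(blob_disp))                                            # 0x1f9e0
print(hex(rip_relative_target(0x79C1, len(blob_insn), blob_disp)))  # 0x273a8

# 799d: 8b 05 45 b0 4d 00      mov 0x4db045(%rip),%eax  # 4e29e8 <in_memory_blob_size>
size_insn = bytes.fromhex("8b0545b04d00")
size_disp = int.from_bytes(size_insn[2:], "little", signed=True)
print(hex(rip_relative_target(0x799D, len(size_insn), size_disp)))  # 0x4e29e8
```

Both results match the comments objdump printed, which confirms that the displacement is relative to the following instruction and stored little-endian in the last four bytes of the encoding.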

December 2: Waiting for change, waiting for opportunity

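After a whole day of confusion, I finally realized that I had been barking up the wrong tree! I was aiming at entirely the wrong target. Still, the work of the previous two days was not wasted: without what I learned about strings in assembly, I would surely have sunk into an even deeper quagmire.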
