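Advanced use of the Linux sort command (sorting on multiple columns)

Sorting whole lines with sort is simple.

Sorting on several columns at once, with TAB as the field separator and some columns in reverse order, makes the command noticeably harder to write.

Take the following file, whose fields are separated by [TAB]: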

Group-ID   Category-ID   Text        Frequency
----------------------------------------------
200        1000          oranges     10
200        900           bananas     5
200        1000          pears       8
200        1000          lemons      10
200        900           figs        4
190        700           grapes      17

Sort it on the following columns, in this order (column 4 is compared before column 3, and column 4 is sorted in reverse):

    * Group ID (integer)
    * Category ID (integer)
    * Frequency “sorted in reverse order” (integer)
    * Text (alpha-numeric)

The sorted result should be:

Group-ID   Category-ID   Text        Frequency
----------------------------------------------
190        700           grapes      17
200        900           bananas     5
200        900           figs        4
200        1000          lemons      10
200        1000          oranges     10
200        1000          pears       8

The sort command can do this directly:

sort -t $'\t' -k 1n,1 -k 2n,2 -k4rn,4 -k3,3 my-file

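Explanation:

-t $'\t': use TAB as the field separator
-k 1,1: sort on the first column only. With a single 1 (that is, -k 1), the key runs from the first column to the end of the line
n: compare numerically. The default is lexicographic order, under which 10 < 2
r: reverse the order. The default is ascending order

So the final command is: sort -t $'\t' -k 1n,1 -k 2n,2 -k4rn,4 -k3,3 my-file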
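
The key order in that command can also be sketched as a C comparator for qsort(). This is only an illustration: the struct row type and the function names are made up here, and strcmp() stands in for sort's text comparison, which matches sort only in the C locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One row of the sample file; the struct and field names are illustrative. */
struct row {
    int group_id;
    int category_id;
    const char *text;
    int frequency;
};

/* Three-way integer comparison that cannot overflow. */
static int cmp_int(int a, int b)
{
    return (a > b) - (a < b);
}

/* Same key order as: -k 1n,1 -k 2n,2 -k4rn,4 -k3,3 */
static int cmp_rows(const void *pa, const void *pb)
{
    const struct row *a = pa;
    const struct row *b = pb;
    int c;

    if ((c = cmp_int(a->group_id, b->group_id)) != 0)
        return c;                        /* key 1: ascending */
    if ((c = cmp_int(a->category_id, b->category_id)) != 0)
        return c;                        /* key 2: ascending */
    if ((c = cmp_int(b->frequency, a->frequency)) != 0)
        return c;                        /* key 4: descending (operands swapped) */
    return strcmp(a->text, b->text);     /* key 3: byte-wise string order */
}

/* Sorts the rows in place using the four keys above. */
void sort_rows(struct row *rows, size_t n)
{
    assert(rows != NULL);
    qsort(rows, n, sizeof rows[0], cmp_rows);
}
```

Each key is consulted only when all earlier keys compare equal, which is exactly how sort treats successive -k options.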


Using the register, volatile, and restrict keywords [repost]

Original article: register、volatile、restrict 三关键字的用法 – RaymondAmos的技术专栏 – CSDN博客.

register

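A variable declared with the register specifier belongs to the register storage class. This class resembles the automatic storage class: it has automatic storage duration, block scope, and no linkage. Declaring a variable register is only a request, so the variable may still end up as an ordinary automatic variable. Either way, you cannot take the address of a variable declared register. If it is not initialized, its value is indeterminate.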

volatile
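volatile tells the compiler that, besides being modified by the program itself, the variable may also be modified by other agents or threads. So whenever the value of a volatile variable is used, it is read again from its memory location instead of reusing a value cached in a register. For example,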

val1=x;
val2=x;

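Without volatile, the compiler may take x straight from a register when assigning val2 instead of reading it from its original memory location. Yet between the two assignments x may well have been changed by something the compiler does not know about (the operating system, hardware, another thread, and so on). If x is declared volatile, the compiler does not use the cached value and reads x from memory every time. Note that volatile only forces the reads and writes to happen; it does not make them atomic or ordered across threads.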
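
One place where this matters in standard C is a flag shared with a signal handler. The following is a minimal sketch, with made-up names (got_signal, on_signal, volatile_flag_demo), of a handler changing a variable behind the back of the normal control flow:

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the main flow; the name is illustrative. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT at ourselves, and reports the flag.
 * Returns 1 if the handler's write was observed, -1 on setup failure. */
int volatile_flag_demo(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    got_signal = 0;
    assert(got_signal == 0);
    raise(SIGINT);          /* the handler runs before raise() returns */
    return got_signal;      /* volatile forces a fresh read from memory */
}
```

Without volatile the compiler would be entitled to assume got_signal is still 0 after the assignment, since nothing visible in volatile_flag_demo() changes it.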

restrict

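restrict was introduced in C99. It can only qualify pointers, and it states that the pointer is the sole initial means of accessing a data object. Consider the following example: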

int ar[10];
int * restrict restar=(int *)malloc(10*sizeof(int));
int *par=ar;

Here restar is the sole initial means of accessing the memory allocated by malloc(); par is not. So in:

for(n=0;n<10;n++)
{
    par[n]+=5;
    restar[n]+=5;
    ar[n]*=2;
    par[n]+=3;
    restar[n]+=3;
}

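Because restar is the sole initial means of accessing the allocated memory, the compiler may combine the operations on restar into restar[n]+=8;. par is not the only way to access the array ar, so the corresponding optimization par[n]+=8; is not allowed, because ar[n]*=2 changes the element before par[n]+=3 runs. With the restrict keyword, the compiler can optimize with confidence. The keyword is said to have its origins in old FORTRAN.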
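
restrict is most often seen on function parameters. Here is a minimal sketch (the function add_twice is invented for illustration) in which the caller promises that the two arrays do not overlap:

```c
#include <assert.h>
#include <stddef.h>

/* dst and src are promised not to overlap; the names are illustrative. */
void add_twice(size_t n, int * restrict dst, const int * restrict src)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] += src[i];
        dst[i] += src[i];   /* the compiler may fuse both into one update */
    }
}
```

Because a store to dst[i] cannot change src[i] under this promise, the compiler is free to load src[i] once per iteration. Passing overlapping arrays breaks the promise and the behavior is undefined.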

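Summary

Of these keywords, volatile and restrict both tell the compiler what it may assume when optimizing: volatile rules out caching a value in a register, while restrict permits optimizations that possible aliasing would otherwise prevent.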

Today's EMC written test

Two open-ended questions:

1. The difference between dup(int fd) and dup2(int fd1, int fd2); for details see http://blog.donews.com/mutecat/archive/2007/09/20/1212178.aspx

2. Designing a consistent hashing algorithm
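
For the first question, the difference can be shown with a short POSIX sketch. The function name dup_demo and the use of two pipes are choices made here for illustration: dup() lets the kernel pick the lowest free descriptor number, while dup2() lets the caller pick it and silently closes whatever that number referred to before.

```c
#include <assert.h>
#include <unistd.h>

/* Returns 0 if both duplicates write into pipe p, -1 on any error. */
int dup_demo(void)
{
    int p[2], q[2];
    char buf[2];

    if (pipe(p) == -1 || pipe(q) == -1)
        return -1;

    /* dup: the kernel chooses the new descriptor number. */
    int copy = dup(p[1]);
    /* dup2: we choose q[1]; its old target (pipe q) is closed first. */
    if (copy == -1 || dup2(p[1], q[1]) == -1)
        return -1;
    assert(copy != p[1] && copy != q[1]);

    /* Both copy and q[1] now refer to the write end of pipe p. */
    if (write(copy, "a", 1) != 1 || write(q[1], "b", 1) != 1)
        return -1;
    if (read(p[0], buf, 2) != 2)
        return -1;

    close(copy);
    close(p[0]);
    close(p[1]);
    close(q[0]);
    close(q[1]);
    return (buf[0] == 'a' && buf[1] == 'b') ? 0 : -1;
}
```

The dup2() form is what shells rely on for redirection, since the target number (0, 1, or 2) has to be chosen by the caller.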

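The rest were multiple-choice questions covering a fairly wide range, with somewhat more questions on Linux and language-level topics,

including stack buffer overflows, the implementation of the container_of macro in the Linux kernel, spinlocks, and virtual memory.

For container_of, see my previous blog post: http://yaronspace.cn/blog/index.php/archives/1026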
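
As a quick reminder of the idea, here is a simplified sketch of container_of built on offsetof. The kernel's own definition adds compile-time type checking through GNU extensions; the types struct list_node and struct task below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified form: subtract the member's offset from the member's address. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical types for illustration. */
struct list_node {
    struct list_node *next;
};

struct task {
    int id;
    struct list_node link;   /* embedded node, as in kernel-style lists */
};

/* Recovers the enclosing task from a pointer to its embedded link. */
struct task *task_of(struct list_node *node)
{
    assert(node != NULL);
    return container_of(node, struct task, link);
}
```

This is what lets a generic list store only struct list_node pointers and still get back to the object that contains each node.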

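Fixing slow sshd connections on Debian

Lately clients connecting to sshd on Debian have been very slow, waiting about half a minute every time. Today I finally ran out of patience, looked it up online, and fixed the problem.

Cause:

The main cause is that Debian enables reverse DNS lookups by default, and these are slow.

Fix:

1. Edit /etc/nsswitch.conf, find the hosts line, and replace it with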

hosts: files dns [NOTFOUND=return]

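2. Check /etc/resolv.conf to see whether the DNS server addresses are set correctly; if an entry is of no use, just comment it out

3. Restart sshd: kill it first, then start it again with /usr/sbin/sshd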

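Some good blog posts on multithreaded programming on Linux

1. Common programming models for multithreaded servers

2. When multithreaded servers are appropriate

3. 15 tips for concurrent programming (translation)

4. Essentials of multithreaded systems programming in C++

From 陈硕's blog: http://blog.csdn.net/solstice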

Viewing CPU, Memory, I/O and NetFlow [repost]

Original article: http://blogread.cn/it/article.php?id=3908&f=sa

iostat: view disk I/O

[root@localhost ~]# iostat -d -x 2
                             extended device statistics
device mgr/s mgw/s    r/s    w/s    kr/s    kw/s   size queue   wait svc_t  %b
hda        0     0    0.0    0.9     0.1     5.4    6.3   0.0    4.7   0.9   0
                             extended device statistics
device mgr/s mgw/s    r/s    w/s    kr/s    kw/s   size queue   wait svc_t  %b
hda        0     3    0.0    2.0     0.0    20.0   10.0   0.0    0.8   0.5   0
 ......

Explanation: shows disk I/O activity, refreshing every two seconds

[root@localhost ~]# vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    284  68700 165876 416748    0    0     0     5    1    1  0  0 100  0  0
 ......

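Explanation: shows CPU usage, refreshing every 5 seconds; the rightmost columns (us, sy, id, wa, st) hold the CPU utilization figures

top: view per-process CPU usage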

[root@localhost ~]# top

Then, inside top, press Shift+P (uppercase P):

top - 13:38:52 up 102 days,  4:17,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  81 total,   2 running,  79 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1035292k total,   966592k used,    68700k free,   165876k buffers
Swap:  2096472k total,      284k used,  2096188k free,   416760k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4875 root      15   0  2192 1000  800 R  0.3  0.1   0:00.15 top
    1 root      15   0  2060  620  532 S  0.0  0.1   0:01.65 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
.......

Explanation: shows the CPU usage of each process, sorted by that value

free: view memory usage

[root@localhost ~]# free
             total       used       free     shared    buffers     cached
Mem:       1035292     966592      68700          0     165876     416768
-/+ buffers/cache:     383948     651344
Swap:      2096472        284    2096188

Explanation: shows memory usage
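
The "-/+ buffers/cache" line in that output is derived from the "Mem:" line. A small worked sketch in C (the struct and function names are made up here) reproduces the two figures from the sample output, on the assumption that buffers and cache count as reclaimable:

```c
#include <assert.h>
#include <stddef.h>

/* Figures from the "Mem:" line, in the units free prints (KiB by default). */
struct mem_line {
    long total, used, free_kib, buffers, cached;
};

/* "used" on the -/+ buffers/cache line: memory held by applications. */
long used_minus_cache(const struct mem_line *m)
{
    assert(m != NULL);
    return m->used - m->buffers - m->cached;
}

/* "free" on the -/+ buffers/cache line: memory applications could claim. */
long free_plus_cache(const struct mem_line *m)
{
    assert(m != NULL);
    return m->free_kib + m->buffers + m->cached;
}
```

With the sample values, 966592 - 165876 - 416768 gives 383948 and 68700 + 165876 + 416768 gives 651344, matching the second line of the output. So the machine is far less short of memory than the 68700 in the free column suggests.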

top: view per-process memory usage

[root@localhost ~]# top

Then, inside top, press Shift+M (uppercase M):

top - 13:48:52 up 102 days,  4:27,  1 user,  load average: 0.00, 0.01, 0.00
Tasks:  81 total,   2 running,  79 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1035292k total,   966592k used,    68700k free,   165876k buffers
Swap:  2096472k total,      284k used,  2096188k free,   416784k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4128 root      34  19  280m 266m 2112 S  0.0 26.4   1:48.40 yum-updatesd
 8314 www       18   0  171m  34m  34m S  0.0  3.4   0:06.03 memcacheq
 8280 www       15   0 88084  34m  588 S  0.0  3.4   0:01.23 memcached
10907 mysql     15   0  122m  16m 3892 S  0.0  1.7   0:25.29 mysqld
......

Explanation: shows the memory usage of each process, sorted by that value

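[The Beauty of Programming] Methods for extending the range of a random number generator

Problem

Given a random number generator random3() that produces random integers in [1, 3], use random3() to build a function random5() that produces random integers in [1, 5].

Analysis

How can a larger range be built from numbers in [1, 3] while keeping every number in the larger range equally likely? Two operations come to mind: addition and multiplication.

Consider the following expression:

3 * (random3() - 1) + random3();

The value of this expression lies in [1, 9], and every value is equally likely, with probability 1/9.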
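
One way to convince yourself of this is to enumerate all nine (a, b) pairs of the two draws. In the sketch below (count_outcomes is a name invented here), a - 1 plays the role of a base-3 high digit and b - 1 the low digit, so every pair maps to a different value:

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often each value of 3 * (a - 1) + b occurs for a, b in 1..3.
 * counts needs room for indexes 0..9; index 0 stays unused. */
void count_outcomes(int counts[10])
{
    assert(counts != NULL);
    for (int v = 0; v < 10; v++)
        counts[v] = 0;
    for (int a = 1; a <= 3; a++)
        for (int b = 1; b <= 3; b++)
            counts[3 * (a - 1) + b]++;
}
```

Every value from 1 to 9 is hit exactly once, and since two independent uniform draws make each pair occur with probability 1/3 * 1/3, each value has probability 1/9.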

Next, how do we get numbers in [1, 5] from numbers in [1, 9]?

One approach is rejection sampling: generate a random number in [1, 9] and, if it falls outside [1, 5], sample again.

Solution

int random5()
{
    int val = 0;
    do {
        val = 3 * (random3() - 1) + random3();
    } while (val > 5);
    return val;
}

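Generalization

To abstract the problem further: given a generator random_m() with range [1, m], write a function random_n() that generates numbers in [1, n], where m < n && n <= m * m.
General solution:

int random_n()
{
    int val = 0;
    int t = (m * m / n) * n; // t is the largest multiple of n satisfying t <= m * m
    do {
        val = m * (random_m() - 1) + random_m();
    } while (val > t);
    return (val - 1) % n + 1; // fold [1, t] onto [1, n]; each value keeps probability 1/n
}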
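
For experimenting, the general scheme can be written as self-contained C. This is a sketch under some assumptions: m and n are passed as parameters instead of being fixed, random_m() is a stand-in built on rand(), and the result is folded from [1, t] onto [1, n] with a modulo so that values above n are not returned.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for random_m(): uniform in [1, m], built on rand().
 * Draws at or above the largest multiple of m are rejected to avoid bias. */
static int random_m(int m)
{
    int limit = RAND_MAX - RAND_MAX % m;   /* a multiple of m */
    int r;
    do {
        r = rand();
    } while (r >= limit);
    return r % m + 1;
}

/* Uniform in [1, n], for m < n <= m * m. */
int random_n(int m, int n)
{
    assert(m < n && n <= m * m);
    int t = (m * m / n) * n;               /* largest multiple of n <= m * m */
    int val;
    do {
        /* Two independent draws give a uniform value in [1, m * m]. */
        val = m * (random_m(m) - 1) + random_m(m);
    } while (val > t);                     /* rejection sampling */
    return (val - 1) % n + 1;              /* fold [1, t] onto [1, n] */
}
```

Calling random_n(3, 5) gives the random5() case; because t is a multiple of n, each of the n results is backed by the same number of accepted values.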

References:

http://stackoverflow.com/questions/137783/expand-a-random-range-from-1-5-to-1-7

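Recommended Firefox add-on: vimperator (use Firefox the Vim way)

While browsing 水木 (newsmth) today, I came across this post: http://www.newsmth.net/bbsrecon.php?id=8017

It introduces vimperator, a Firefox add-on written by a developer abroad that lets you drive Firefox efficiently in the style of Vim. I downloaded it and tried it out, and it works well!

Download: https://addons.mozilla.org/en-US/firefox/addon/vimperator/

Common commands (updated as I go):

open: open a new URL in the current tab, e.g. open www.baidu.com

tabopen: open a URL in a new tab

back: go back

forward: go forward

gt/gT: move between tabs

d: close the current tab

hjkl: scroll the page or move the cursor

i (insert): enter insert mode, where the cursor can be moved

/ : search the content of the page

All in all, it is very powerful!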

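The data structures and algorithms behind MySQL indexes [good read]

Starting from the B+ tree, the internal structure of MySQL indexes, this article analyzes how to improve index performance and how indexes are stored.

It is written in plain, easy-to-follow language. Recommended!

Link: http://www.cnblogs.com/leoo2sk/archive/2011/07/10/mysql-index.html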

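140 Google interview questions [repost]

Original article: http://coolshell.cn/articles/3345.html

Source: http://blog.seattleinterviewcoach.com/2009/02/140-google-interview-questions.html (blocked in China)

A recruiter collected more than 140 Google interview questions and posted them all on his blog, mainly for the positions below. Because the site is blocked and the content contains nothing sensitive, I have copied the original here.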
  • Product Marketing Manager
  • Product Manager
  • Software Engineer
  • Software Engineer in Test
  • Quantitative Compensation Analyst
  • Engineering Manager
  • AdWords Associate

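This blog post lists the questions Google uses to interview for the positions above. Many are not easy to answer, but they are classic and fiendish, in the style of companies such as Google, Microsoft, and Amazon. I have not translated the questions, because I believe they are best left in English. For some of them I have added notes; they are not necessarily correct, but I hope they give you some ideas. If a question leaves you stumped, Google it; StackOverflow or Wikipedia may give you a very thorough answer.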

Product Marketing Manager
  • Why do you want to join Google?
  • What do you know about Google’s product and technology?
  • If you are Product Manager for Google’s Adwords, how do you plan to market this?
  • What would you say during an AdWords or AdSense product seminar?
  • Who are Google’s competitors, and how does Google compete with them?
  • Have you ever used Google’s products? Gmail?
  • What’s a creative way of marketing Google’s brand name and product?
  • If you are the product marketing manager for Google’s Gmail product, how do you plan to market it so as to achieve 100 million customers in 6 months?
  • How much money you think Google makes daily from Gmail ads?
  • Name a piece of technology you’ve read about recently. Now tell me your own creative execution for an ad for that product.
  • Say an advertiser makes $0.10 every time someone clicks on their ad. Only 20% of people who visit the site click on their ad. How many people need to visit the site for the advertiser to make $20?
  • Estimate the number of students who are college seniors, attend four-year schools, and graduate with a job in the United States every year.

Product Manager
  • How would you boost the GMail subscription base?
  • What is the most efficient way to sort a million integers? (陈皓: merge sort)
  • How would you re-position Google’s offerings to counteract competitive threats from Microsoft?
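  • How many golf balls can fit in a school bus? (陈皓: questions like this generally test your approach to solving problems. Note that you cannot simply treat a golf ball as a small cube. It is a sphere, and stacked spheres sit offset from one another, so the centers of three adjacent balls form an equilateral triangle.)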
  • You are shrunk to the height of a nickel and your mass is proportionally reduced so as to maintain your original density. You are then thrown into an empty glass blender. The blades will start moving in 60 seconds. What do you do?
  • How much should you charge to wash all the windows in Seattle?
  • How would you find out if a machine’s stack grows up or down in memory?
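  • Explain a database in three sentences to your eight-year-old nephew. (陈皓: explain what a database is to an eight-year-old nephew in three sentences; this tests your ability to express yourself.)
  • How many times a day does a clock’s hands overlap? (陈皓: the classic clock problem)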
  • You have to get from point A to point B. You don’t know if you can get there. What would you do?
  • Imagine you have a closet full of shirts. It’s very hard to find a shirt. So what can you do to organize your shirts for easy retrieval? (陈皓: a very good question. Do not assume that categorized lookup is easy; think about how a library classifies and retrieves its books. Also think about how you would implement the equivalent of a hash table or a tree in your closet.)
  • Every man in a village of 100 married couples has cheated on his wife. Every wife in the village instantly knows when a man other than her husband has cheated, but does not know when her own husband has. The village has a law that does not allow for adultery. Any wife who can prove that her husband is unfaithful must kill him that very day. The women of the village would never disobey this law. One day, the queen of the village visits and announces that at least one husband has been unfaithful. What happens? (陈皓: a rather adults-only and very funny question. Pay attention to the recursion among the wives. This kind of question is a classic distributed-communication problem, so search for it online.)
  • In a country in which people only want boys, every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country? (陈皓: my first reaction was that this country is China. It is a probability problem; in fact, however families go about it, the 50% probability never changes.)
  • If the probability of observing a car in 30 minutes on a highway is 0.95, what is the probability of observing a car in 10 minutes (assuming constant default probability)?
  • If you look at a clock and the time is 3:15, what is the angle between the hour and the minute hands? (The answer to this is not zero!)
  • Four people need to cross a rickety rope bridge to get back to their camp at night. Unfortunately, they only have one flashlight and it only has enough light left for seventeen minutes. The bridge is too dangerous to cross without a flashlight, and it’s only strong enough to support two people at any given time. Each of the campers walks at a different speed. One can cross the bridge in 1 minute, another in 2 minutes, the third in 5 minutes, and the slow poke takes 10 minutes to cross. How do the campers make it across in 17 minutes? (陈皓: the classic bridge-crossing problem)
  • You are at a party with a friend and 10 people are present including you and the friend. your friend makes you a wager that for every person you find that has the same birthday as you, you get $1; for every person he finds that does not have the same birthday as you, he gets $2. would you accept the wager?
  • How many piano tuners are there in the entire world?
  • You have eight balls all of the same size. 7 of them weigh the same, and one of them weighs slightly more. How can you find the ball that is heavier by using a balance and only two weighings? (陈皓: the classic weighing problem. There are many variations on it, but none is hard to answer.)
  • You have five pirates, ranked from 5 to 1 in descending order. The top pirate has the right to propose how 100 gold coins should be divided among them. But the others get to vote on his plan, and if fewer than half agree with him, he gets killed. How should he allocate the gold in order to maximize his share but live to enjoy it? (Hint: One pirate ends up with 98 percent of the gold.)
  • You are given 2 eggs. You have access to a 100-story building. Eggs can be very hard or very fragile means it may break if dropped from the first floor or may not even break if dropped from 100th floor. Both eggs are identical. You need to figure out the highest floor of a 100-story building an egg can be dropped without breaking. The question is how many drops you need to make. You are allowed to break 2 eggs in the process. (陈皓: start dropping from floors that are multiples of 3, i.e. 3, 6, 9, 12, ... If the egg breaks on floor 3n, drop the second egg from floor 3n-1; if it does not break, the highest safe floor is 3n-1, otherwise it is 3n-2.)
  • Describe a technical problem you had and how you solved it.
  • How would you design a simple search engine?
  • Design an evacuation plan for San Francisco.
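  • There’s a latency problem in South Africa. Diagnose it. (陈皓: this question purely tests your problem-solving ability; there is no definite answer. That said, the first step in solving a performance problem is usually to find the bottleneck, and there are many ways to do that: tools, bisection, timing logs, and so on.)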
  • What are three long term challenges facing Google?
  • Name three non-Google websites that you visit often and like. What do you like about the user interface and design? Choose one of the three sites and comment on what new feature or project you would work on. How would you design it?
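  • If there is only one elevator in the building, how would you change the design? How about if there are only two elevators in the building? (陈皓: the classic elevator design problem. Questions like this come in endless variations and mainly test your design ability and how well you adapt to changing requirements; the hotel booking system is a similar one.)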
  • How many vacuum’s are made per year in USA?