Notes on the git clone command

The git clone command copies an existing repository. Cloning automatically creates a remote connection named origin that points back to the original repository, which makes working with a central repository convenient. Git's collaboration model is repository-to-repository: unlike SVN, where a working copy commits to a central repository, Git's push and pull operations always move changes from one repository to another.

Usage:

a) git clone <repo>
Copies the repository at repo to the local machine.

b) git clone <repo> <directory>
Copies the repository at repo into the local directory directory.

Example:

[root@CentOS ~]# git clone https://github.com/sharklinux/shark
Cloning into 'shark'...
remote: Counting objects: 1003, done.
remote: Total 1003 (delta 0), reused 0 (delta 0), pack-reused 1003
Receiving objects: 100% (1003/1003), 21.43 MiB | 304.00 KiB/s, done.
Resolving deltas: 100% (245/245), done.
[root@CentOS ~]# ls
anaconda-ks.cfg  shark

Running git clone https://github.com/sharklinux/shark creates a shark directory on the local machine (note the lack of a .git suffix, marking this as a non-bare local copy) containing the entire contents of the shark repository.
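For form (b), a minimal sketch (the target directory name my-shark is hypothetical):

git clone https://github.com/sharklinux/shark my-shark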

References:
git clone

Fixing "Warning: Cannot modify header information – headers already sent by …"

At noon I switched my blog theme, and then I could no longer log in. The errors were:

Warning: Cannot modify header information - headers already sent by (output started at /home/to/public_html/en/wp-content/themes/nordby/functions.php:78) in /home/to/public_html/en/wp-login.php on line 418

Warning: Cannot modify header information - headers already sent by (output started at /home/to/public_html/en/wp-content/themes/nordby/functions.php:78) in /home/to/public_html/en/wp-login.php on line 431

The problem is in /home/to/public_html/en/wp-content/themes/nordby/functions.php. In the broken file there are blank lines at the very end, after the PHP code; they get sent to the browser as output before WordPress can set its headers:

[screenshot: functions.php ending in trailing blank lines]

Removing the trailing blank lines fixes it:

[screenshot: functions.php with the trailing blank lines removed]
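A quick way to spot and strip such trailing whitespace from the shell (a sketch; the path comes from the error message above, and GNU sed is assumed):

# Show the file's last bytes; any whitespace after the closing ?> tag
# is output that triggers the "headers already sent" warning.
tail -c 16 /home/to/public_html/en/wp-content/themes/nordby/functions.php | od -c

# Delete all trailing blank lines at the end of the file (classic sed one-liner).
sed -i -e :a -e '/^\n*$/{$d;N;ba' -e '}' /home/to/public_html/en/wp-content/themes/nordby/functions.php

Omitting the closing ?> tag entirely in pure-PHP files avoids the problem altogether.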

References:
Cannot modify header information – headers already sent by …

Notes on the git init command

The git init command creates a new Git repository. It can either initialize a completely new, empty repository or convert an existing, unversioned project into a Git repository. Running git init creates a .git subdirectory in the project's root directory; apart from that, none of the project's other files are changed.

Usage:

a) git init
Turns the current directory into a Git repository.

b) git init <directory>
Creates a Git repository in the given directory. This makes a new folder named directory containing nothing but a .git subdirectory.

c) git init --bare <directory>
Initializes an empty Git repository with no working directory. Repositories intended for sharing should always be created with --bare. By convention, repositories initialized with --bare carry a .git suffix; for example, a bare repository for project should be named project.git.

Let's compare git init <directory> with git init --bare <directory>.
First, run git init linux:

[root@CentOS ~]# git init linux
Initialized empty Git repository in /root/linux/.git/
[root@CentOS ~]# ls -alt linux/
total 8
dr-xr-x---. 5 root root 4096 Jun  2 12:53 ..
drwxr-xr-x. 7 root root 4096 Jun  2 12:42 .git
drwxr-xr-x. 3 root root   17 Jun  2 12:42 .
[root@CentOS ~]# ls -alt linux/.git
total 20
drwxr-xr-x. 7 root root 4096 Jun  2 12:42 .
drwxr-xr-x. 4 root root   28 Jun  2 12:42 objects
-rw-r--r--. 1 root root   92 Jun  2 12:42 config
-rw-r--r--. 1 root root   23 Jun  2 12:42 HEAD
drwxr-xr-x. 2 root root   20 Jun  2 12:42 info
drwxr-xr-x. 2 root root 4096 Jun  2 12:42 hooks
-rw-r--r--. 1 root root   73 Jun  2 12:42 description
drwxr-xr-x. 2 root root    6 Jun  2 12:42 branches
drwxr-xr-x. 3 root root   17 Jun  2 12:42 ..
drwxr-xr-x. 4 root root   29 Jun  2 12:42 refs

Then run git init --bare bsd:

[root@CentOS ~]# git init --bare bsd
Initialized empty Git repository in /root/bsd/
[root@CentOS ~]# ls -lt bsd
total 16
drwxr-xr-x. 4 root root   28 Jun  2 13:01 objects
-rw-r--r--. 1 root root   66 Jun  2 13:01 config
drwxr-xr-x. 2 root root    6 Jun  2 13:01 branches
-rw-r--r--. 1 root root   73 Jun  2 13:01 description
-rw-r--r--. 1 root root   23 Jun  2 13:01 HEAD
drwxr-xr-x. 2 root root 4096 Jun  2 13:01 hooks
drwxr-xr-x. 2 root root   20 Jun  2 13:01 info
drwxr-xr-x. 4 root root   29 Jun  2 13:01 refs

Notice that all the repository files are created directly under the bsd directory, with no .git subdirectory. A sketch of the shared bare-repository workflow follows.
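As a sketch of why shared repositories are created bare: a bare repository has no working tree for anyone to edit directly, so it serves purely as a push/pull target (paths hypothetical):

git init --bare /srv/git/project.git    # shared repository, conventionally suffixed .git
git clone /srv/git/project.git          # collaborators clone from it and push back to it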

References:
git init

Fixing flickering in GNU screen sessions

When working across several GNU screen sessions, you may find the screen flashing, for example after backspacing away every character on the command line, or when paging past the last page of a man page. The visual bell is the culprit. The fix:

(1) Edit the ~/.screenrc file;
(2) Add the following lines:

vbell_msg "bell: window ~%"     # Message for visual bell
vbellwait 2                     # Seconds to pause the screen for visual bell
vbell off                       # Turns visual bell off
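(In a running session, Ctrl-a Ctrl-g also toggles the visual bell for the current window, without editing ~/.screenrc.)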

References:
http://stackoverflow.com/questions/897358/gnu-screen-refresh-problem

A brief guide to installing LTTng

This post follows the official LTTng documentation.

(1) Install LTTng
I'm on CentOS, so per the RHEL documentation I installed via yum:
a) Set up the package repository and import its signing key:

wget -P /etc/yum.repos.d/ http://packages.efficios.com/repo.files/EfficiOS-RHEL7-x86-64.repo
rpmkeys --import http://packages.efficios.com/rhel/repo.key
yum updateinfo

b) Then install the LTTng packages:

yum install lttng-ust-devel # installing lttng-ust also pulls in liburcu0
yum install kmod-lttng-modules
yum install lttng-tools-devel
yum install babeltrace-devel

(2) Run a quick test:

lttng create my-session
lttng enable-event --kernel --all
lttng start
lttng stop
lttng destroy

Then run the ls command:

[root@CentOS ~]# ls
anaconda-ks.cfg  lttng-traces

You can see that the captured traces all land in the lttng-traces directory.
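To read a captured trace as text, point babeltrace (pulled in above) at the session directory; the exact directory name includes a timestamp, so the glob here is an assumption:

babeltrace ~/lttng-traces/my-session-*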

Setting up a Scala development environment

This post uses CentOS 7 as the example to show how to set up a Scala development environment:

(1) Install Scala:
Run the "yum install scala" command:

[root@localhost ~]# yum install scala
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirrors.tds.net
 * extras: bay.uchicago.edu
 * updates: dallas.tx.mirror.xygenhosting.com
Nothing to do

yum cannot find a Scala package, so take a different route and download the RPM directly with wget:

[root@localhost ~]# wget http://downloads.typesafe.com/scala/2.11.6/scala-2.11.6.rpm
--2015-05-27 22:07:32--  http://downloads.typesafe.com/scala/2.11.6/scala-2.11.6.rpm
......
Length: 111919675 (107M) [application/octet-stream]
Saving to: ‘scala-2.11.6.rpm’

100%[=========================================================================>] 111,919,675  298KB/s   in 6m 15s

2015-05-27 22:13:48 (291 KB/s) - ‘scala-2.11.6.rpm’ saved [111919675/111919675]

Next, install Scala:

[root@localhost ~]# rpm -ivh scala-2.11.6.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:scala-2.11.6-0                   ################################# [100%]

The installation succeeds; run scala:

[root@localhost ~]# scala
/usr/bin/scala: line 23: java: command not found

Running Scala requires a JRE, so the next step is to install a Java environment.

(2) Run yum install java:

[root@localhost ~]# yum install java
......
Complete!

(3) Run scala and print "Hello world!":

[root@localhost ~]# scala
Welcome to Scala version 2.11.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_45).
Type in expressions to have them evaluated.
Type :help for more information.

scala> print("Hello world!")
Hello world!
scala> :quit

Installation successful!
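Beyond the REPL, the toolchain can be exercised end to end by compiling and running a standalone file (a minimal sketch; the file and object names are arbitrary):

cat > Hello.scala <<'EOF'
object Hello extends App {
  println("Hello world!")
}
EOF
scalac Hello.scala   # compiles to Hello.class and friends
scala Hello          # runs the compiled object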

Writing a standalone program with the Spark API

This post follows the Self-Contained Applications section of the Spark website and uses Scala to develop a small standalone program.

(1) First, install sbt by following the official documentation. I used the RPM package route:

curl https://bintray.com/sbt/rpm/rpm | sudo tee /etc/yum.repos.d/bintray-sbt-rpm.repo
sudo yum install sbt
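A quick way to confirm the install (the first run downloads sbt's own dependencies, so it may take a while):

sbt about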

(2) Next, create a SparkApp folder under /home, laid out as follows:

bash-4.1# find /home/SparkApp/
/home/SparkApp/
/home/SparkApp/simple.sbt
/home/SparkApp/src
/home/SparkApp/src/main
/home/SparkApp/src/main/scala
/home/SparkApp/src/main/scala/SimpleApp.scala

The simple.sbt file reads as follows:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"

The SimpleApp.scala program, which counts the lines of a local file that contain the letters "a" and "b", is as follows:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "file:///usr/local/spark/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache() // cache the RDD, since it is scanned twice below
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}

(3) Run the sbt package command to build the jar:

bash-4.1# sbt package
......
[success] Total time: 89 s, completed May 25, 2015 10:16:51 PM

(4) Invoke the spark-submit script to run the program (local[4] runs Spark locally with four worker threads):

bash-4.1# /usr/local/spark/bin/spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar
......
Lines with a: 60, Lines with b: 29

As you can see, it prints the correct result.

Setting up a Spark development environment

This post uses docker to set up a Spark environment, based on the 1.3.0 image provided by sequenceiq.

First, pull the Spark image:

docker pull sequenceiq/spark:1.3.0

Once the pull succeeds, run Spark:

docker run -i -t -h sandbox sequenceiq/spark:1.3.0 bash

Check that Spark works:

bash-4.1# spark-shell --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1
......
scala> sc.parallelize(1 to 1000).count()
......
res0: Long = 1000

It prints 1000. OK!

(1) spark-shell prints a lot of log output at startup. To quiet it:
a) In the /usr/local/spark/conf folder, copy log4j.properties.template to a new log4j.properties file:

bash-4.1# cd /usr/local/spark/conf
bash-4.1# cp log4j.properties.template log4j.properties

b) In log4j.properties, change "log4j.rootCategory=INFO, console" to "log4j.rootCategory=WARN, console"; see the sketch below.
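Step b) can also be done non-interactively (a sketch, assuming the stock template contents and GNU sed):

sed -i 's/^log4j.rootCategory=INFO, console/log4j.rootCategory=WARN, console/' /usr/local/spark/conf/log4j.properties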

(2) spark-shell startup also produces this warning:

15/05/25 04:49:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

It cannot find Hadoop's native library. The fix:

export LD_LIBRARY_PATH=/usr/local/hadoop/lib/native/:$LD_LIBRARY_PATH
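To make this persist across shells, the export can be appended to the shell profile (a sketch, assuming bash; the single quotes keep $LD_LIBRARY_PATH unexpanded until login):

echo 'export LD_LIBRARY_PATH=/usr/local/hadoop/lib/native/:$LD_LIBRARY_PATH' >> ~/.bashrc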

See the related stackoverflow discussions:
a)Hadoop “Unable to load native-hadoop library for your platform” error on CentOS
b)Hadoop “Unable to load native-hadoop library for your platform” error on docker-spark?

(3) The Quick Start gives this example:

scala> val textFile = sc.textFile("README.md")
......
scala> textFile.count() // Number of items in this RDD

Running it produces an error:

scala> textFile.count()
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://sandbox:9000/user/root/README.md
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)

You can see that the program tries to find the file in HDFS, hence the error.

There are two fixes:
a) Specify the local filesystem explicitly:

scala> val textFile = sc.textFile("file:///usr/local/spark/README.md")
textFile: org.apache.spark.rdd.RDD[String] = file:///usr/local/spark/README.md MapPartitionsRDD[3] at textFile at <console>:21

scala> textFile.count()
res1: Long = 98

b) Upload the file to HDFS:

bash-4.1# hadoop fs -put /usr/local/spark/README.md README.md
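The upload can be verified before restarting spark-shell (a quick check):

hadoop fs -ls README.md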

Then run spark-shell:

bash-4.1# spark-shell --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.3.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
Type in expressions to have them evaluated.
Type :help for more information.
15/05/25 05:22:15 WARN Client: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/05/25 05:22:15 WARN Client: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
Spark context available as sc.
SQL context available as sqlContext.

scala> val textFile = sc.textFile("README.md")
textFile: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1] at textFile at <console>:21

scala> textFile.count()
res0: Long = 98

Reference mail thread:
Spark Quick Start – call to open README.md needs explicit fs prefix

P.S. When running a Spark release downloaded on the host (outside docker) from https://spark.apache.org/downloads.html, you get the following warning:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

The fix is again to copy the log4j.properties.template file in the /path/to/spark/conf folder to a log4j.properties file.

See the stackoverflow discussion:
log4j:WARN No appenders could be found for logger (running jar file, not web app)

Don't hire "cost-effective" engineers

A while back I saw a forum post whose author complained about not being able to hire "cost-effective" engineers. I thought nothing of it at the time, but turning it over later, something felt off. Why would a company insist on hiring "cost-effective" engineers in the first place?

"Cost-effectiveness," as the name suggests, is the ratio of "performance" to "price": the higher the performance and the lower the price, the better the ratio. If someone earns 5,000 yuan a month but creates as much value for the company as people paid 8,000 to 10,000 yuan, that person is very "cost-effective." But hold on: isn't that unfair to them? People should be paid in line with their ability. Why must a company hire "cost-effective" engineers at all? A ratio of 1, where performance and price match, is perfectly fine. Nobody is a fool; everyone knows roughly what they are worth. When people realize their salary falls far short of their ability and the value they create, they either work less efficiently or get ready to walk away, and what good does that do the company? Hiring a "cost-effective" engineer looks like a bargain at first, but in the long run the company loses. I've even heard that some companies make driving down offers a performance metric for their recruiters, which is simply laughable.

Stop trying to hire engineers with high "cost-effectiveness"; hire engineers whose "cost-effectiveness" is right.