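The difference between processes and threads

Processes

Threads

How processes and threads are related

How processes and threads differ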

What is the difference between a process and a thread? Here is the analogy I use in Linux Kernel Development. Processes are the abstraction of running programs: A binary image, virtualized memory, various kernel resources, an associated security context, and so on. Threads are the unit of execution in a process: A virtualized processor, a stack, and program state. Put another way, processes are running binaries and threads are the smallest unit of execution schedulable by an operating system’s process scheduler.

A process contains one or more threads. In single-threaded processes, the process contains one thread. You can say the thread is the process—there is one thing going on. In multithreaded processes, the process contains more than one thread—there’s more than one thing going on.

The two primary virtualized abstractions in modern operating systems are virtualized memory and a virtualized processor. Both afford the illusion to running processes that they alone consume the machine’s resources. Virtualized memory gives processes a unique view of memory that seamlessly maps back to physical RAM or on-disk storage (swap space). A virtualized processor lets processes act as if they alone run on a processor, when in fact multiple processes are multitasking across multiple processors.

Virtualized memory is associated with the process and not the thread. Thus, threads share one memory address space. Conversely, a distinct virtualized processor is associated with each thread. Each thread is an independent schedulable entity.

What’s the point? We obviously need processes. But why introduce the separate concept of a thread and allow multithreaded processes? There are four primary benefits to multithreading:

  • Programming abstraction. Dividing up work and assigning each division to a unit of execution (a thread) is a natural approach to many problems. Programming patterns that utilize this approach include the reactor, thread-per-connection, and thread pool patterns. Some, however, view threads as an anti-pattern. The inimitable Alan Cox summed this up well with the quote, “threads are for people who can’t program state machines.”
  • Parallelism. In machines with multiple processors, threads provide an efficient way to achieve true parallelism. As each thread receives its own virtualized processor and is an independently-schedulable entity, multiple threads may run on multiple processors at the same time, improving a system’s throughput. To the extent that threads are used to achieve parallelism (that is, there are no more threads than processors), the “threads are for people who can’t program state machines” quote does not apply.
  • Blocking I/O. Without threads, blocking I/O halts the whole process. This can be detrimental to both throughput and latency. In a multithreaded process, individual threads may block, waiting on I/O, while other threads make forward progress. Asynchronous & non-blocking I/O are alternative solutions to threads for this issue.
  • Memory savings. Threads provide an efficient way to share memory yet utilize multiple units of execution. In this manner they are an alternative to multiple processes.

The cost of these benefits is increased complexity: concurrency has to be managed through mechanisms such as mutexes and condition variables. Given the growing trend toward processors sporting multiple cores and systems sporting multiple processors, threading is only going to become a more important tool in system programming.
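As a minimal sketch of the shared address space and the concurrency cost described above, the Go program below uses goroutines as a stand-in for threads. This is an analogy rather than a one-to-one mapping, since the Go runtime multiplexes goroutines onto OS threads. The names `counter` and `incrementAll` are illustrative and do not come from the original text.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is shared state: every goroutine in the process sees the same memory.
type counter struct {
	mu sync.Mutex
	n  int
}

// incrementAll starts `workers` goroutines that each add perWorker to the
// shared counter, and returns the final value once all of them have finished.
func incrementAll(workers, perWorker int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				c.mu.Lock() // without the lock this increment is a data race
				c.n++
				c.mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return c.n
}

func main() {
	fmt.Println(incrementAll(4, 1000)) // 4000
}
```

The memory is shared for free, but the mutex is the price: every access to the shared counter has to go through it.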

Common mistakes in Go and how to avoid them

1. Not Accepting Interfaces

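A Go data type can express both state and behavior: state is the type's internal data structure, and behavior is the set of methods the type has. Interfaces are one of Go's most powerful features; they are what make types extensible. An interface is defined by a set of methods, and a type satisfies an interface as long as it implements all of the interface's methods.

bytes.Buffer is a variable-length byte buffer with Read and Write methods.

The wrong way

func (page *Page) saveSourceAs(path string) {
	b := new(bytes.Buffer)
	b.Write(page.Source.Content)
	page.saveSource(b.Bytes(), path)
}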

func (page *Page) saveSource(by []byte, inpath string) {
	WriteToDisk(inpath, bytes.NewReader(by))
}

The right way

func (page *Page) saveSourceAs(path string) {
	b := new(bytes.Buffer)
	b.Write(page.Source.Content)
	page.saveSource(b, path)
}

func (page *Page) saveSource(b io.Reader, inpath string) {
	WriteToDisk(inpath, b)
}

2. Not Using io.Reader & io.Writer

The benefits of these two interfaces:

  • Simple & flexible interfaces for many operations around input and output
  • Provides access to a huge wealth of functionality
  • Keeps operations extensible

The definitions of the two interfaces:

type Reader interface {
	Read(p []byte) (n int, err error)
}

type Writer interface {
	Write(p []byte) (n int, err error)
}

The wrong way:

func (v *Viper) ReadBufConfig(buf *bytes.Buffer) error {
	v.config = make(map[string]interface{})
	v.marshalReader(buf, v.config)
	return nil
}

The right way

func (v *Viper) ReadBufConfig(in io.Reader) error {
	v.config = make(map[string]interface{})
	v.marshalReader(in, v.config)
	return nil
}

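3. Requiring Broad Interfaces

  • A function should accept only an interface that contains the methods it needs
  • A function should not accept a broad interface when a narrow one would work
  • Compose broad interfaces from narrower ones

The wrong way

func ReadIn(f File) {
	// Read fills at most len(b) bytes, so the buffer must not be empty.
	b := make([]byte, 512)
	n, err := f.Read(b)
	...
}

The right way

func ReadIn(r Reader) {
	// Read fills at most len(b) bytes, so the buffer must not be empty.
	b := make([]byte, 512)
	n, err := r.Read(b)
	...
}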

4. Methods Vs Functions

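Developers with an object-oriented background tend to overuse methods, defining everything in terms of structs and methods.

What is a function?

  • Operates on input N1 and produces output N2
  • The same input always produces the same output
  • Should not depend on state

What is a method?

  • Defines the behavior of a type
  • A function that operates on a value of a type
  • Should use state
  • Logically connected

The relationship between functions and methods

  • Functions can accept interfaces as input and can be used with interfaces
  • Methods are bound to a specific type

The right way

func extractShortcodes(s string, p *Page, pt Template) (string, map[string]shortcode, error) {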
	...
	for {
		switch currItem.typ {
		...
		case tError:
			err := fmt.Errorf("%s:%d: %s", p.BaseFileName(), (p.lineBNumRawContentStart()+pt.lexer.lineNum()-1), currItem)
		}
	}
}
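The distinction can be sketched with a small example. The `slugify` function and this `Page` type are hypothetical and unrelated to the code above: the function depends only on its input, while the method is bound to a type and manages its state.

```go
package main

import (
	"fmt"
	"strings"
)

// slugify is a function: same input, same output, no state.
func slugify(title string) string {
	return strings.ReplaceAll(strings.ToLower(strings.TrimSpace(title)), " ", "-")
}

// Page is a hypothetical type whose methods manage its state.
type Page struct {
	Title string
	views int
}

// Visit is a method: it is bound to *Page and reads and updates its state.
func (p *Page) Visit() int {
	p.views++
	return p.views
}

func main() {
	p := &Page{Title: " Hello Go "}
	fmt.Println(slugify(p.Title)) // hello-go
	p.Visit()
	fmt.Println(p.Visit()) // 2
}
```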

5. Pointer Vs Values

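Use a pointer when you need to share a value; otherwise use a value (a copy). If you want to share a value through its methods, use a pointer receiver. Methods commonly manage state, so this is the usual choice, but it is not safe for concurrent use.

If a type is an empty struct (no state, only behavior), just use a value; this is safe for concurrent use.

An example with a pointer receiver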

type InMemoryFile struct {
	at int64
	name string
	data []byte
	closed bool
}

func (f *InMemoryFile) Close() error {
	atomic.StoreInt64(&f.at, 0)
	f.closed = true
	return nil
}

An example with a value receiver

type Time struct {
	sec int64
	nsec uintptr
	loc *Location
}

func (t Time) IsZero() bool {
	return t.sec == 0 && t.nsec == 0
}
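The practical difference between the two receiver kinds can be seen in a few lines. `Counter` is a hypothetical type used only for this comparison.

```go
package main

import "fmt"

// Counter is a hypothetical type used to contrast the two receiver kinds.
type Counter struct{ n int }

// IncValue has a value receiver: it increments a copy, so the caller's
// Counter is unchanged.
func (c Counter) IncValue() { c.n++ }

// IncPointer has a pointer receiver: it increments the caller's Counter.
func (c *Counter) IncPointer() { c.n++ }

func main() {
	var c Counter
	c.IncValue()
	fmt.Println(c.n) // 0
	c.IncPointer()
	fmt.Println(c.n) // 1
}
```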

6. Thinking of Errors As Strings

error is an interface whose single behavior is to return the error message as a string. Exported error variables are easier to check against.

type error interface {
	Error() string
}

The wrong way

func NewPage(name string) (p *Page, err error) {
	if len(name) == 0 {
		return nil, errors.New("Zero length page name")
	}
	...
}

The right way

// ErrNoName is exported so that callers can compare against it.
var ErrNoName = errors.New("Zero length page name")

func NewPage(name string) (*Page, error) {
	if len(name) == 0 {
		return nil, ErrNoName
	}
	...
}

func Foo(name string) error {
	_, err := NewPage("bar")
	if err == ErrNoName {
		_, err = NewPage("default")
	} else if err != nil {
		log.Fatal(err)
	}
	return err
}

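By implementing the error interface with a custom type, you can extend what an error carries:

  • Provide context
  • Provide a type that can be compared with error values
  • Provide dynamic values based on the internal error state

Examples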

type Error struct {
	Code ErrorCode
	Message string
	Detail interface{}
}

func (e Error) Error() string {
	return fmt.Sprintf("%s: %s", strings.ToLower(strings.Replace(e.Code.String(), "_", " ", -1)), e.Message)
}

type PathError struct {
	Op string
	Path string
	Err error
}

func (e *PathError) Error() string {
	return e.Op + " " + e.Path + ": " + e.Err.Error()
}

7. Maps Are Not Safe

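Concurrent writes to a map can crash the program (the runtime detects concurrent map writes and aborts). A mutex can be used to protect the map.

func (m *MMFS) Create(name string) (File, error) {
	m.lock()
	m.getData()[name] = MemFileCreate(name)
	m.unlock()
	m.registerDirs(m.getData()[name])
	return m.getData()[name], nil
}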
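A self-contained sketch of the same idea, using a hypothetical `SafeMap` wrapper. It guards every access, reads included, with a sync.RWMutex and uses defer so the lock is released on every path.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeMap is a hypothetical map wrapper guarded by a sync.RWMutex.
type SafeMap struct {
	mu sync.RWMutex
	m  map[string]int
}

func NewSafeMap() *SafeMap { return &SafeMap{m: make(map[string]int)} }

// Inc increments the value stored under key while holding the write lock.
func (s *SafeMap) Inc(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key]++
}

// Get reads the value stored under key while holding the read lock.
func (s *SafeMap) Get(key string) (int, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[key]
	return v, ok
}

func main() {
	sm := NewSafeMap()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sm.Inc("hits")
		}()
	}
	wg.Wait()
	v, _ := sm.Get("hits")
	fmt.Println(v) // 100
}
```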

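8. Some Values Are Not Comparable

  • map, slice, and function values cannot be compared with each other, so they cannot be used as map key types either
  • A struct is comparable if and only if all of its fields are comparable.
  • Channels are comparable only when they have the same channel type.
  • Interface values are comparable: two interface values are equal when their dynamic types are identical and their dynamic values are equal, or when both are nil
  • Reference types (map, slice, channel, function, pointer) and interfaces can all be compared with nil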
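A few of these rules can be checked directly. In the sketch below, `point` and `equalInts` are hypothetical names: a struct with comparable fields works with == and as a map key, while slices need an element-by-element comparison.

```go
package main

import "fmt"

// point has only comparable fields, so point values can be compared with ==
// and used as map keys.
type point struct{ x, y int }

// equalInts compares two slices element by element, since == is not
// defined for slices (except against nil).
func equalInts(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(point{1, 2} == point{1, 2}) // true
	seen := map[point]bool{{1, 2}: true}
	fmt.Println(seen[point{1, 2}]) // true

	var a, b interface{} = 1, 1
	fmt.Println(a == b) // true: same dynamic type and equal dynamic value

	fmt.Println(equalInts([]int{1, 2}, []int{1, 2})) // true
}
```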

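9. Mind the Difference Between make and new

  • make only allocates and initializes objects of slice, map, and channel types
  • new returns a pointer to the zero value of the given type

type Foo map[string]string

x := make(Foo)
x["ping"] = "pong"    // ok

y := new(Foo)
(*y)["ping"] = "pong" // panic: *y is a nil map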
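If a pointer to a map is really wanted, one option is to point it at a map created with make before writing to it. A minimal sketch, where `newFoo` is a hypothetical helper:

```go
package main

import "fmt"

type Foo map[string]string

// newFoo shows one way to make a new(Foo) pointer usable: point it at a
// map created by make before writing to it.
func newFoo() *Foo {
	y := new(Foo)  // *y is a nil map
	*y = make(Foo) // now *y is an initialized, empty map
	return y
}

func main() {
	y := new(Foo)
	fmt.Println(*y == nil, len(*y)) // true 0: reading a nil map is fine

	z := newFoo()
	(*z)["ping"] = "pong"     // ok
	fmt.Println((*z)["ping"]) // pong
}
```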

Go mistakes I have made

sync.Mutex.Lock and sync.Mutex.Unlock

mutex.Lock() tries to acquire the mutex; if it is already held, the goroutine blocks until it can acquire it. mutex.Unlock() releases a held mutex. If a goroutine never releases a mutex it has acquired, other goroutines cannot acquire it, which leads to deadlock.

  • Wrong:
safemap.Lock()
if isfetch := safemap.v[url]; isfetch {
	fmt.Println(url, " is fetched")
	// returns while still holding the lock
	return
} else {
	safemap.v[url] = true
}
safemap.Unlock()
  • Right:
safemap.Lock()
if isfetch := safemap.v[url]; isfetch {
	fmt.Println(url, " is fetched")
	// it is easy to forget to unlock here
	safemap.Unlock()
	return
} else {
	safemap.v[url] = true
	safemap.Unlock()
}
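An alternative that avoids having to remember an Unlock on each return path is defer. The sketch below assumes a hypothetical `SafeMap` type shaped like the safemap value above (an embedded sync.Mutex plus a map field `v`); `markFetched` is an illustrative name.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeMap is a hypothetical stand-in for the safemap value used above.
type SafeMap struct {
	sync.Mutex
	v map[string]bool
}

// markFetched reports whether url was already fetched, and marks it as
// fetched otherwise. defer guarantees the unlock on every return path.
func (s *SafeMap) markFetched(url string) bool {
	s.Lock()
	defer s.Unlock()
	if s.v[url] {
		return true
	}
	s.v[url] = true
	return false
}

func main() {
	sm := &SafeMap{v: make(map[string]bool)}
	fmt.Println(sm.markFetched("http://golang.org/")) // false
	fmt.Println(sm.markFetched("http://golang.org/")) // true
}
```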

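sync.WaitGroup

WaitGroup is used to wait for a group of goroutines to finish. The main goroutine calls Add() to set the number of goroutines to wait for, and each goroutine calls Done() when it finishes. Meanwhile the main goroutine calls Wait(), which blocks until all of the goroutines have finished. A WaitGroup must not be copied after first use; see the notes on structures that must not be copied for more information.

WaitGroup is a struct, not a reference type, so it must not be passed to a goroutine by value; pass a pointer to the WaitGroup instead.

  • Wrong: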
package main

import "sync"

func Crawl(url string, depth int, fetcher Fetcher, waitgroup sync.WaitGroup) {
	...
}

func main() {
	wg := sync.WaitGroup{}
	wg.Add(1)
	go Crawl("http://golang.org/", 4, fetcher, wg)
	wg.Wait()
}

go vet can detect this mistake:

MacBookPro:crawler zhuqiuzhi$ go vet crewler.go
crewler.go:24: Crawl passes lock by value: sync.WaitGroup contains sync.noCopy
crewler.go:49: call of Crawl copies lock value: sync.WaitGroup contains sync.noCopy
crewler.go:58: call of Crawl copies lock value: sync.WaitGroup contains sync.noCopy
exit status 1
  • Right:
package main

import "sync"

func Crawl(url string, depth int, fetcher Fetcher, waitgroup *sync.WaitGroup) {
	...
}

func main() {
	wg := sync.WaitGroup{}
	wg.Add(1)
	go Crawl("http://golang.org/", 4, fetcher, &wg)
	wg.Wait()
}
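For completeness, here is a fully runnable sketch of the pointer-passing pattern. `fetchAll` and `worker` are hypothetical helpers that only pretend to fetch; the point is that Add is called before each goroutine starts, Done is deferred, and the WaitGroup is shared by pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll is a hypothetical helper: it starts one goroutine per URL and
// waits for all of them. The WaitGroup is shared by pointer, never copied.
func fetchAll(urls []string) []string {
	var (
		wg      sync.WaitGroup
		results = make([]string, len(urls))
	)
	for i, u := range urls {
		wg.Add(1) // Add before starting the goroutine
		go worker(i, u, results, &wg)
	}
	wg.Wait()
	return results
}

// worker pretends to fetch url and stores a result in its own slot, so no
// two goroutines write to the same element.
func worker(i int, url string, results []string, wg *sync.WaitGroup) {
	defer wg.Done()
	results[i] = "fetched " + url
}

func main() {
	for _, r := range fetchAll([]string{"http://golang.org/", "http://golang.org/pkg/"}) {
		fmt.Println(r)
	}
}
```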

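Redis data types and their uses

Redis is a key-value database. It is very fast and suits applications that are write-heavy, whose data changes frequently, and whose data fits Redis's built-in data types. It is not a good fit for large data sets in which only a portion of the data is hot, or for data sets that do not fit in memory.

NoSQL databases generally do not provide ACID (atomicity, consistency, isolation, durability), or provide it only in part. Redis implements part of ACID:

  • Single-threaded, which guarantees consistency and isolation
  • Full compliance (if appendfsync is configured)
  • Durability

String

Strings are the simplest data type in Redis. They can store any kind of string, including binary data. Keep in mind that the maximum length of a string value is 512MB. Strings can be used for visit counters: if every page of a site has a unique pageid, the number of visits to a page can be stored under the key visits:pageid:totals. Redis provides incr and decr to increment or decrement the stored value by one, and incrby and decrby to increase or decrease the value of a key by a given amount.

Common string commands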

127.0.0.1:6379> set visits:1:totals 1233
OK
127.0.0.1:6379> get visits:1:totals 
"1233"
127.0.0.1:6379> incr visits:1:totals
(integer) 1234
127.0.0.1:6379> decr visits:1:totals
(integer) 1233
127.0.0.1:6379> incrby visits:1:totals 1
(integer) 1234
127.0.0.1:6379> decrby visits:1:totals 1
(integer) 1233

Hash

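Hashes are particularly well suited to storing object data used by an application. To store the data of a user object alias, for example, the key can be users:alias. hset sets the value of a field of the object alias.

Common Hash commands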

127.0.0.1:6379> hset users:joy name "John Doe"
(integer) 1
127.0.0.1:6379> hset users:joy email "jdoe@test.com"
(integer) 1
127.0.0.1:6379> hget users:joy name
"John Doe"
127.0.0.1:6379> hget users:joy email
"jdoe@test.com"
127.0.0.1:6379> hgetall users:joy
1) "name"
2) "John Doe"
3) "email"
4) "jdoe@test.com"
127.0.0.1:6379> hkeys users:joy
1) "name"
2) "email"
127.0.0.1:6379> hvals users:joy
1) "John Doe"
2) "jdoe@test.com"

Set

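A set is an unordered collection of data with no duplicate elements. It suits storing the users in a Google+ circle or an interest group. Redis's sinter returns the intersection of two sets, and sunion returns their union.

Common Set commands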

127.0.0.1:6379> sadd cicrle:jdoe:family users:anana
(integer) 1
127.0.0.1:6379> sadd cicrle:jdoe:family users:mike
(integer) 1
127.0.0.1:6379> sadd cicrle:jdoe:family users:richard
(integer) 1
127.0.0.1:6379> smembers cicrle:jdoe:family
1) "users:mike"
2) "users:richard"
3) "users:anana"
127.0.0.1:6379> sadd cicrle:jdoe:soccer users:mike
(integer) 1
127.0.0.1:6379> sadd cicrle:jdoe:soccer users:adam
(integer) 1
127.0.0.1:6379> smembers cicrle:jdoe:soccer
1) "users:mike"
2) "users:adam"
127.0.0.1:6379> sinter cicrle:jdoe:soccer cicrle:jdoe:family
1) "users:mike"
127.0.0.1:6379> sunion cicrle:jdoe:soccer cicrle:jdoe:family
1) "users:anana"
2) "users:richard"
3) "users:mike"
4) "users:adam"
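The semantics of sinter and sunion can be sketched in Go with map-based sets. This only illustrates what the two commands compute; it is not how Redis implements them and it does not talk to a Redis server. The `set` type and the helpers are hypothetical, and the results are sorted only to make the output deterministic (Redis itself returns set members in no particular order).

```go
package main

import (
	"fmt"
	"sort"
)

// set is a hypothetical in-memory set of strings.
type set map[string]struct{}

func newSet(members ...string) set {
	s := make(set)
	for _, m := range members {
		s[m] = struct{}{} // adding a duplicate is a no-op
	}
	return s
}

// inter returns the members present in both a and b, sorted.
func inter(a, b set) []string {
	out := []string{}
	for m := range a {
		if _, ok := b[m]; ok {
			out = append(out, m)
		}
	}
	sort.Strings(out)
	return out
}

// union returns the members present in a or b, sorted.
func union(a, b set) []string {
	all := newSet()
	for m := range a {
		all[m] = struct{}{}
	}
	for m := range b {
		all[m] = struct{}{}
	}
	out := make([]string, 0, len(all))
	for m := range all {
		out = append(out, m)
	}
	sort.Strings(out)
	return out
}

func main() {
	family := newSet("users:anana", "users:mike", "users:richard")
	soccer := newSet("users:mike", "users:adam")
	fmt.Println(inter(soccer, family)) // [users:mike]
	fmt.Println(union(soccer, family)) // [users:adam users:anana users:mike users:richard]
}
```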

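Equivalent binary trees

How it works

Different binary trees can store the same sequence of values in their nodes. For example, the following two binary trees both store the sequence 1, 1, 2, 3, 5, 8, 13.

[Figure: two binary trees]

The trees can be traversed recursively, sending the value of each node over a channel, and the values are then compared one by one:

  • If the lengths differ, the trees are not equivalent.
  • If any pair of values differs, the trees are not equivalent.
  • Otherwise, the trees are equivalent.

Implementation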

package main

import (
	"fmt"
	"golang.org/x/tour/tree"
)

// WalkImpl walks the tree t in order, sending every value from the tree to the channel ch.
func WalkImpl(t *tree.Tree, ch chan int) {
	if t == nil {
		return
	}
	WalkImpl(t.Left, ch)
	ch <- t.Value
	WalkImpl(t.Right, ch)
}

func Walk(t *tree.Tree, ch chan int) {
	WalkImpl(t, ch)
	close(ch)
}

// Same reports whether the trees t1 and t2 contain the same values.
func Same(t1, t2 *tree.Tree) bool {
	ch1, ch2 := make(chan int), make(chan int)
	go Walk(t1, ch1)
	go Walk(t2, ch2)
	for {
		v1, ok1 := <-ch1
		v2, ok2 := <-ch2
		if !ok1 || !ok2 {
			return ok1 == ok2
		}
		if v1 != v2 {
			return false
		}
	}
}

func main() {
	fmt.Print("tree.New(1) == tree.New(1): ")
	if Same(tree.New(1), tree.New(1)) {
		fmt.Println("PASSED")
	} else {
		fmt.Println("FAILED")
	}

	fmt.Print("tree.New(1) != tree.New(2): ")
	if !Same(tree.New(1), tree.New(2)) {
		fmt.Println("PASSED")
	} else {
		fmt.Println("FAILED")
	}
}

Output

tree.New(1) == tree.New(1): PASSED
tree.New(1) != tree.New(2): PASSED
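One caveat with the implementation above: when Same returns early, the walking goroutines stay blocked forever on their next send. A possible variation, sketched below, adds a done channel so the walkers can stop. To stay self-contained it uses a local `Tree` type as a stand-in for tree.Tree from golang.org/x/tour/tree; `insert`, `build`, `walk`, and `same` are hypothetical names.

```go
package main

import "fmt"

// Tree is a local stand-in for tree.Tree from golang.org/x/tour/tree.
type Tree struct {
	Left  *Tree
	Value int
	Right *Tree
}

// insert adds v to a binary search tree and returns its root.
func insert(t *Tree, v int) *Tree {
	if t == nil {
		return &Tree{Value: v}
	}
	if v < t.Value {
		t.Left = insert(t.Left, v)
	} else {
		t.Right = insert(t.Right, v)
	}
	return t
}

// build creates a binary search tree by inserting vs in order.
func build(vs ...int) *Tree {
	var t *Tree
	for _, v := range vs {
		t = insert(t, v)
	}
	return t
}

// walk sends the values of t in order on ch. It returns false as soon as
// done is closed, so the traversal can be abandoned early.
func walk(t *Tree, ch chan<- int, done <-chan struct{}) bool {
	if t == nil {
		return true
	}
	if !walk(t.Left, ch, done) {
		return false
	}
	select {
	case ch <- t.Value:
	case <-done:
		return false
	}
	return walk(t.Right, ch, done)
}

// same reports whether t1 and t2 hold the same sequence of values.
func same(t1, t2 *Tree) bool {
	done := make(chan struct{})
	defer close(done) // releases any walker still blocked on a send
	start := func(t *Tree) <-chan int {
		ch := make(chan int)
		go func() {
			defer close(ch)
			walk(t, ch, done)
		}()
		return ch
	}
	ch1, ch2 := start(t1), start(t2)
	for {
		v1, ok1 := <-ch1
		v2, ok2 := <-ch2
		if !ok1 || !ok2 {
			return ok1 == ok2
		}
		if v1 != v2 {
			return false
		}
	}
}

func main() {
	a, b, c := build(3, 1, 2), build(1, 2, 3), build(1, 2, 4)
	fmt.Println(same(a, b), same(a, c)) // true false
}
```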