Netlink program with Epoll system call
In this program, you are going to learn
How to communicate between kernel and user space?
How to create a socket?
How to send data?
How to receive data?
How to use user space APIs?
How to use kernel space APIs?
How to use socket APIs?
Netlink is used to transfer information between the kernel and user-space processes.
Netlink is a datagram-oriented service. Both SOCK_RAW and SOCK_DGRAM are valid values for socket_type.
Let us answer a few basic questions about this socket
What is the purpose of the socket(AF_NETLINK, SOCK_RAW, NETLINK_TESTFAMILY) call?
See Answer
It creates a raw Netlink socket with a custom family identifier (NETLINK_TESTFAMILY).
Why choose AF_NETLINK as the address family for the socket?
See Answer
AF_NETLINK is specifically designed for Netlink communication between the Linux kernel and user-space.
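What does the SOCK_RAW type parameter indicate in the socket creation?
See Answer
It signifies that the socket operates in raw mode, providing direct access to Netlink messages. The Netlink protocol does not distinguish between SOCK_RAW and SOCK_DGRAM, so either value gives the same behavior.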
How is NETLINK_TESTFAMILY used in socket creation?
See Answer
It’s a custom Netlink family identifier, helping segregate messages for a specific application or purpose.
Can multiple sockets share the same Netlink family identifier (NETLINK_TESTFAMILY)?
See Answer
Yes, multiple sockets can use the same Netlink family identifier for communication.
How can the socket be utilized for communication with the Linux kernel?
See Answer
The socket can send and receive Netlink messages, facilitating communication between user-space and the kernel.
Is error checking necessary after creating the Netlink socket?
See Answer
Yes, it’s essential to check for errors after creating the socket to handle potential issues.
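A minimal sketch of that check, assuming a hypothetical helper named open_netlink_socket (it is not part of any library). It returns the descriptor on success and -1 after reporting the reason through perror:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: open a raw Netlink socket for `protocol`.
 * Returns the file descriptor, or -1 after printing the reason. */
int open_netlink_socket(int protocol)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd == -1) {
        /* EPROTONOSUPPORT usually means that nothing in the kernel is
         * registered for this protocol number, for example because the
         * kernel module has not been loaded yet. */
        perror("socket(AF_NETLINK)");
        return -1;
    }
    return fd;
}
```

A caller would typically write `int fd = open_netlink_socket(NETLINK_TESTFAMILY);` and stop early when the result is -1.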
Is cleanup necessary after using the Netlink socket?
See Answer
Yes, it’s good practice to close the Netlink socket using close when it’s no longer needed.
What role does the struct sockaddr_nl play in Netlink socket creation?
See Answer
It provides address information for the Netlink socket, specifying details like family and process ID.
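As an illustration, the two addresses a client usually needs can be filled in as follows. The helper names make_kernel_addr and make_local_addr are hypothetical and exist only for this sketch; the fields are the ones declared in linux/netlink.h.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Destination address for the kernel: pid 0 and no multicast groups. */
struct sockaddr_nl make_kernel_addr(void)
{
    struct sockaddr_nl addr;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;
    addr.nl_groups = 0;
    return addr;
}

/* Local address, suitable for bind(): identify this process by its PID. */
struct sockaddr_nl make_local_addr(void)
{
    struct sockaddr_nl addr;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = (unsigned int)getpid();
    return addr;
}
```

The kernel address is passed as msg_name when sending, while the local address is only needed if the program binds the socket explicitly.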
What happens if the Netlink buffer becomes full during message reception?
See Answer
The kernel may return an error, such as “No buffer space available” (ENOBUFS).
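One common mitigation is to request a larger receive buffer with SO_RCVBUF and to treat ENOBUFS as "messages were dropped, keep reading" rather than as a fatal error. The sketch below assumes two hypothetical helpers, grow_receive_buffer and is_overrun; the buffer size a program should ask for depends on its traffic.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for a larger receive buffer on `fd`.
 * The kernel may clamp the requested value. Returns 0 on success. */
int grow_receive_buffer(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
}

/* Returns 1 when a failed receive only reports dropped messages, so the
 * caller can keep reading; returns 0 for any other error code. */
int is_overrun(int saved_errno)
{
    return saved_errno == ENOBUFS;
}
```

After a failed recvmsg, a caller could save errno and use `is_overrun(saved)` to decide whether to continue the loop or give up.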
What is the role of the sendmsg function in Netlink communication?
See Answer
It is used to send a message on a Netlink socket, providing flexibility in constructing and sending messages.
What is the role of the recvmsg function in Netlink communication?
See Answer
It is used to receive a message on a Netlink socket, providing flexibility in receiving and parsing messages.
How can a Netlink socket be used for kernel module communication?
See Answer
By defining a custom Netlink family, kernel modules can communicate with user-space applications.
What is the primary purpose of the epoll system call?
See Answer
To efficiently monitor multiple file descriptors for I/O events
What types of file descriptors can be monitored using epoll?
See Answer
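Sockets, socket pairs, pipes and FIFOs (named pipes), timerfd, eventfd, signalfd and POSIX message queue descriptors. Regular files and directories are not supported: epoll_ctl fails with EPERM for them.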
What data structure is used by epoll to store events?
See Answer
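A red-black tree holds the registered file descriptors (the interest list), and a linked list holds the descriptors that are ready (the ready list).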
How do you handle errors when using the epoll system call?
See Answer
Check the return value for -1 to detect errors, and use perror to print the error message that corresponds to errno.
How does epoll handle a set of file descriptors with different states (e.g., reading, writing, exception)?
See Answer
- Create the epoll instance:
Before monitoring file descriptors, the application creates an epoll instance using the epoll_create1 system call.
int epoll_fd = epoll_create1(0);
- Register file descriptors:
The application registers file descriptors with the epoll instance using the epoll_ctl system call. It specifies the file descriptor, the events it is interested in (EPOLLIN for readability, EPOLLOUT for writability, etc.), and user-defined data associated with the file descriptor.
struct epoll_event event;
event.events = EPOLLIN | EPOLLOUT; // Interested in readability and writability
event.data.fd = my_file_descriptor; // File descriptor to monitor
epoll_ctl(epoll_fd, EPOLL_CTL_ADD, my_file_descriptor, &event);
- Wait for events:
The application enters a loop where it calls epoll_wait to wait for events. This call blocks until one or more registered file descriptors become ready or until the timeout expires.
#define MAX_EVENTS 10
struct epoll_event events[MAX_EVENTS];
int num_events = epoll_wait(epoll_fd, events, MAX_EVENTS, timeout_ms);
- Modify or remove file descriptors:
The application can dynamically modify or remove file descriptors from the epoll set using the epoll_ctl system call. For example, to modify events for an existing file descriptor:
struct epoll_event new_event;
new_event.events = EPOLLOUT; // Now interested in writability only
new_event.data.fd = my_file_descriptor; // EPOLL_CTL_MOD replaces the stored data as well
epoll_ctl(epoll_fd, EPOLL_CTL_MOD, my_file_descriptor, &new_event);
To remove a file descriptor from the epoll set:
epoll_ctl(epoll_fd, EPOLL_CTL_DEL, my_file_descriptor, NULL);
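The steps above can be tried end to end without a Netlink socket. The following sketch uses a pipe as the monitored descriptor; the function name epoll_lifecycle_demo is made up for this example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Walk through create, add, wait, modify and delete using a pipe.
 * Returns the number of ready events reported by epoll_wait, or -1
 * if any step fails. */
int epoll_lifecycle_demo(void)
{
    int pipefd[2];
    struct epoll_event ev = {0}, ready[4];
    int epfd, n = -1;

    if (pipe(pipefd) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd == -1)
        goto out_pipe;

    /* Register the read end for readability. */
    ev.events = EPOLLIN;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto out;

    /* Make the read end readable, then wait up to one second. */
    if (write(pipefd[1], "x", 1) != 1)
        goto out;
    n = epoll_wait(epfd, ready, 4, 1000);

    /* Modify the registration (edge-triggered), then remove it. */
    ev.events = EPOLLIN | EPOLLET;
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, pipefd[0], &ev) == -1)
        n = -1;
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, pipefd[0], NULL) == -1)
        n = -1;
out:
    close(epfd);
out_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return n;
}
```

Because exactly one registered descriptor has data pending, the call is expected to report one ready event.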
How does the application check which file descriptors are ready?
See Answer
After epoll_wait returns, the application iterates through the returned events to identify which file descriptors are ready and for what types of events.
for (int i = 0; i < num_events; ++i) {
    int fd = events[i].data.fd; // Descriptor stored at registration time
    if (events[i].events & EPOLLIN) {
        // fd is ready for reading
    }
    if (events[i].events & EPOLLOUT) {
        // fd is ready for writing
    }
    // Check other events if needed (e.g., EPOLLERR, EPOLLHUP)
}
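The order in which the flags are tested matters when several are set at once. A possible policy, shown here as a sketch with a hypothetical next_action helper, is to treat EPOLLERR as fatal, drain readable data before acting on EPOLLHUP, and only then look at EPOLLOUT:

```c
#include <stdint.h>
#include <sys/epoll.h>

enum fd_action { FD_CLOSE, FD_READ, FD_WRITE, FD_IGNORE };

/* Pick one action for the event mask returned for a descriptor. */
enum fd_action next_action(uint32_t events)
{
    if (events & EPOLLERR)
        return FD_CLOSE;   /* error condition: give up on the descriptor */
    if (events & EPOLLIN)
        return FD_READ;    /* read pending data, even if the peer hung up */
    if (events & EPOLLHUP)
        return FD_CLOSE;   /* hang-up with nothing left to read */
    if (events & EPOLLOUT)
        return FD_WRITE;
    return FD_IGNORE;
}
```

Inside the loop this would be called as `next_action(events[i].events)`. Other policies are equally valid; this is only one way to order the checks.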
What does it mean if epoll_wait returns 0?
See Answer
No file descriptors are ready within the specified timeout.
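The timeout case is easy to reproduce with a pipe that nobody writes to. In this sketch the function name wait_on_idle_pipe is hypothetical and the timeout is given in milliseconds:

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Register the read end of an idle pipe and wait for timeout_ms.
 * Returns the result of epoll_wait, or -1 if the setup fails. */
int wait_on_idle_pipe(int timeout_ms)
{
    int pipefd[2];
    struct epoll_event ev = {0}, ready[1];
    int epfd, n = -1;

    if (pipe(pipefd) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd != -1) {
        ev.events = EPOLLIN;
        ev.data.fd = pipefd[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0)
            n = epoll_wait(epfd, ready, 1, timeout_ms);
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return n;
}
```

Since nothing is ever written to the pipe, the wait ends when the timeout expires and the function returns 0.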
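A socket is created with socket():
client_socket = socket(AF_NETLINK, SOCK_RAW, NETLINK_TESTFAMILY);
The sender identifies itself by its process ID. The nlmsg_pid field of the message header (and the nl_pid field of a sockaddr_nl passed to bind()) can be filled with the calling process’ own pid.
nlh->nlmsg_pid = getpid();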
Sending a Netlink Message
In order to send a netlink message to the kernel or to other user-space processes, a second struct sockaddr_nl addr must be supplied as the destination address, just as a destination is supplied when sending a datagram with sendmsg(). If the message is destined for the kernel, both nl_pid and nl_groups should be set to 0.
addr.nl_pid = 0;
addr.nl_groups = 0;
struct msghdr msg;
msg.msg_name = (void * ) &addr;
msg.msg_namelen = sizeof(addr);
The netlink socket requires its own message header as well. This is for providing a common ground for netlink messages of all protocol types. Because the Linux kernel netlink core assumes the existence of the following header in each netlink message, an application must supply this header in each netlink message it sends:
struct nlmsghdr * nlh = (struct nlmsghdr * ) malloc(NLMSG_SPACE(MAX_PAYLOAD));
nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
nlh->nlmsg_pid = getpid();
nlh->nlmsg_flags = 0;
A netlink message thus consists of the nlmsghdr and the message payload. Once the payload has been written, the complete message sits in the buffer pointed to by nlh. That buffer is then attached to the struct msghdr msg through an iovec:
struct iovec iov;
iov.iov_base = (void * ) nlh;
iov.iov_len = nlh->nlmsg_len;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
After the above steps, a call to sendmsg() sends out the netlink message:
sendmsg(client_socket, &msg, 0);
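The header fields can also be sized to the actual payload instead of a fixed MAX_PAYLOAD. The sketch below assumes a hypothetical helper, build_text_message, which only builds the message; sending it still requires the iovec and msghdr setup shown above.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Allocate a netlink message that carries `text`, including its
 * terminating NUL. The caller frees the result. Returns NULL if the
 * allocation fails. */
struct nlmsghdr *build_text_message(const char *text)
{
    size_t payload = strlen(text) + 1;
    /* NLMSG_SPACE adds the header and rounds up to the alignment. */
    struct nlmsghdr *nlh = calloc(1, NLMSG_SPACE(payload));

    if (nlh == NULL)
        return NULL;
    /* NLMSG_LENGTH is header plus payload, without trailing padding. */
    nlh->nlmsg_len = NLMSG_LENGTH(payload);
    nlh->nlmsg_pid = (unsigned int)getpid();
    nlh->nlmsg_flags = 0;
    memcpy(NLMSG_DATA(nlh), text, payload);
    return nlh;
}
```

With this variant, iov.iov_len would be set to nlh->nlmsg_len as before, so only the bytes that are actually used travel to the kernel.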
epoll_create1() creates an epoll instance and returns a file descriptor that refers to it. Unlike the older epoll_create(), it takes no size hint; its only argument is a flags value, which may be 0 or EPOLL_CLOEXEC. For example,
epoll_fd = epoll_create1(0);
epoll_ctl() adds file descriptors to the epoll instance after it has been created. For example,
ret = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client_socket, &event);
epoll_wait() is then called in a loop, where the application waits for events. For example,
ret = epoll_wait(epoll_fd, events, MAX_EVENTS, -1);
recvmsg is used for receiving Netlink messages. For example,
recvmsg(client_socket, &msg, 0);
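A single receive can deliver more than one netlink message in the buffer. The standard NLMSG_OK and NLMSG_NEXT macros from linux/netlink.h walk through them; the function count_messages below is a hypothetical example that only counts the well-formed messages in a buffer of len bytes.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Count the well-formed netlink messages stored in buf[0..len). */
int count_messages(void *buf, int len)
{
    struct nlmsghdr *nlh = buf;
    int count = 0;

    /* NLMSG_OK checks that a complete header and message fit in the
     * remaining bytes; NLMSG_NEXT advances nlh and decreases len. */
    while (NLMSG_OK(nlh, len)) {
        count++;
        nlh = NLMSG_NEXT(nlh, len);
    }
    return count;
}
```

In a real receive path, the loop body would read NLMSG_DATA(nlh) for each message, and len would be the byte count returned by recvmsg.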
close is used to close the socket and free the system resources associated with it. For example,
(void)close(client_socket);
See the full program below:
#include <linux/netlink.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <signal.h>

#define NETLINK_TESTFAMILY 25
#define MAX_PAYLOAD 1024
#define MAX_EVENTS 2

int client_socket;
int epoll_fd;

/* Only async-signal-safe calls (close, sleep, write, _exit) are used here. */
static void sigint_handler(int signo)
{
    static const char text[] = "Caught sigINT!\n";

    (void)signo;
    (void)close(client_socket);
    (void)close(epoll_fd);
    sleep(2);
    (void)write(STDOUT_FILENO, text, sizeof(text) - 1);
    _exit(EXIT_SUCCESS);
}

void register_signal_handler(int signum, void (*handler)(int))
{
    if (signal(signum, handler) == SIG_ERR) {
        printf("Cannot handle signal\n");
        exit(EXIT_FAILURE);
    }
}

/* Build a "Hello" message and send it to addr.
 * Returns the message buffer, which the caller must free(),
 * or NULL on failure. */
struct nlmsghdr *send_message(int client_socket, struct sockaddr_nl addr)
{
    struct iovec iov;
    struct msghdr msg;
    struct nlmsghdr *nlh = malloc(NLMSG_SPACE(MAX_PAYLOAD));

    if (nlh == NULL) {
        perror("malloc");
        return NULL;
    }
    memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
    nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
    nlh->nlmsg_pid = getpid();
    nlh->nlmsg_flags = 0;
    strcpy((char *)NLMSG_DATA(nlh), "Hello");

    memset(&iov, 0, sizeof(iov));
    iov.iov_base = (void *)nlh;
    iov.iov_len = nlh->nlmsg_len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = (void *)&addr;
    msg.msg_namelen = sizeof(addr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    printf("Sending message to kernel\n");
    printf("-------------------------\n");
    if (sendmsg(client_socket, &msg, 0) < 0) {
        perror("sendmsg");
        free(nlh);
        return NULL;
    }
    printf("Sent message: %s\n\n", (char *)NLMSG_DATA(nlh));
    return nlh;
}

int main(void)
{
    int len, ret;
    int ready_fds;
    struct nlmsghdr *nlh;
    struct sockaddr_nl addr;
    struct epoll_event events[MAX_EVENTS];
    struct epoll_event event;

    register_signal_handler(SIGINT, sigint_handler);

    client_socket = socket(AF_NETLINK, SOCK_RAW, NETLINK_TESTFAMILY);
    if (client_socket == -1) {
        perror("socket");
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0; // For Linux kernel
    addr.nl_groups = 0;

    epoll_fd = epoll_create1(0);
    if (epoll_fd < 0) {
        perror("Epoll creation failed");
        exit(EXIT_FAILURE);
    }

    memset(&event, 0, sizeof(event));
    event.events = EPOLLIN;
    event.data.fd = client_socket;
    ret = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client_socket, &event);
    if (ret < 0) {
        perror("Epoll_ctl failed");
        exit(EXIT_FAILURE);
    }

    while (1) {
        /* One request per iteration; the reply arrives through epoll. */
        nlh = send_message(client_socket, addr);
        if (nlh == NULL)
            break;

        ready_fds = epoll_wait(epoll_fd, events, MAX_EVENTS, -1);
        if (ready_fds < 0) {
            perror("Epoll wait failed");
            exit(EXIT_FAILURE);
        }

        if (ready_fds > 0 && events[0].data.fd == client_socket) {
            /* Receive the reply into the buffer that held the request. */
            struct iovec iov = {
                .iov_base = nlh,
                .iov_len = NLMSG_SPACE(MAX_PAYLOAD),
            };
            struct msghdr msg;

            memset(&msg, 0, sizeof(msg));
            msg.msg_iov = &iov;
            msg.msg_iovlen = 1;

            len = recvmsg(client_socket, &msg, 0);
            if (len > 0) {
                printf("Receving msg from kernel\n");
                printf("------------------------\n");
                printf("Received message: %s\n", (char *)NLMSG_DATA(nlh));
            } else if (len < 0 && errno != ENOBUFS) {
                perror("recv");
                free(nlh);
                break;
            }
            /* On ENOBUFS the reply was dropped; try again next round. */
        }
        free(nlh);
    }

    (void)close(client_socket);
    (void)close(epoll_fd);
    return 0;
}
nlmsg_new allocates a new netlink message (an sk_buff) with room for the given payload size.
struct sk_buff *skb_out;
skb_out = nlmsg_new(message_size, GFP_KERNEL);
nlmsg_put adds a netlink message header to the skb, reserves room for the payload, and returns a pointer to that header. The payload is then copied in through nlmsg_data(nlh).
struct nlmsghdr *nlh = (struct nlmsghdr *) skb->data; // Header of the received message
nlh = nlmsg_put(skb_out, 0, 0, NLMSG_DONE, message_size, 0);
nlmsg_unicast is used to send the message to the user space application.
result = nlmsg_unicast(socket, skb_out, pid);
netlink_kernel_create is used to create a Netlink socket in the kernel.
socket = netlink_kernel_create(&init_net, NETLINK_TESTFAMILY, &config);
wake_up_interruptible is used to wake up any threads that are waiting on the specified wait queue wq. When a thread sets a condition and calls wake_up_interruptible(&wq), it signals to other threads waiting on the same condition that they can proceed.
wake_up_interruptible(&wq);
init_completion is used to initialize a dynamically created completion variable. For example,
init_completion(&comp);
kthread_run is used to create and start a kernel thread. For example,
kthread = kthread_run(thread_func, NULL, "my_thread");
kthread_stop is used to stop and clean up a kernel thread created with kthread_run. For example,
kthread_stop(kthread);
In this example, kthread_run creates a kernel thread to execute thread_func, and kthread_stop is used to stop and clean up the thread when it’s no longer needed. The thread checks kthread_should_stop() to determine when it should exit.
complete is used to signal any waiting tasks to wake up. For example,
complete(&comp);
wait_for_completion waits for the given completion variable to be signaled. For example,
wait_for_completion(&comp);
netlink_kernel_release is used to release the netlink socket created with netlink_kernel_create.
netlink_kernel_release(socket);
See the full program below:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/netlink.h>
#include <linux/kthread.h>
#include <linux/completion.h>
#include <net/netlink.h>
#include <net/net_namespace.h>
#include <linux/delay.h>

#define NETLINK_TESTFAMILY 25
#define NETLINK_MYGROUP 2

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Linux_usr");
MODULE_DESCRIPTION("Netlink - Unicast");

struct sock *socket;
static struct completion comp;
static struct task_struct *thread1;
DECLARE_WAIT_QUEUE_HEAD(wq);

/* Called for every message a user-space process sends to this family. */
static void test_nl_receive_message(struct sk_buff *skb)
{
    struct nlmsghdr *nlh = (struct nlmsghdr *) skb->data;
    pid_t pid = nlh->nlmsg_pid; /* The reply goes back to the sender */
    int result;
    char *message;
    size_t message_size;
    struct sk_buff *skb_out;

    pr_info("Entering: %s\n", __func__);
    pr_info("kernel Received message: %s\n", (char *) nlmsg_data(nlh));

    message = "Hello from kernel unicast";
    message_size = strlen(message) + 1;
    skb_out = nlmsg_new(message_size, GFP_KERNEL);
    if (!skb_out) {
        pr_err("Failed to allocate a new skb\n");
        return;
    }
    nlh = nlmsg_put(skb_out, 0, 0, NLMSG_DONE, message_size, 0);
    NETLINK_CB(skb_out).dst_group = 0;
    strncpy(nlmsg_data(nlh), message, message_size);

    /* nlmsg_unicast() consumes skb_out, so neither skb_out nor nlh
     * may be touched after this call. */
    result = nlmsg_unicast(socket, skb_out, pid);
    if (result < 0)
        pr_err("Failed to send message: %d\n", result);
    else
        pr_info("Sent message: %s\n", message);
    wake_up_interruptible(&wq);
}

static int thread_fun1(void *data)
{
    struct netlink_kernel_cfg config = {
        .input = test_nl_receive_message,
    };

    socket = netlink_kernel_create(&init_net, NETLINK_TESTFAMILY, &config);
    if (socket == NULL)
        pr_err("Failed to create netlink socket\n");
    else
        pr_info("Netlink initialized\n");

    /* Keep running until test_exit() calls kthread_stop(), even when the
     * socket could not be created, so that kthread_stop() always finds a
     * live thread and the completion is always signaled. */
    while (!kthread_should_stop()) {
        pr_info("Thread 1 is running\n");
        wake_up_interruptible(&wq);
        msleep(1000);
    }
    complete(&comp);
    return 0;
}

static int __init test_init(void)
{
    pr_info("Driver Loaded\n");
    init_completion(&comp);
    thread1 = kthread_run(thread_fun1, NULL, "thread1");
    if (IS_ERR(thread1)) {
        pr_alert("Failed to create thread1\n");
        return PTR_ERR(thread1);
    }
    return 0;
}

static void __exit test_exit(void)
{
    if (socket)
        netlink_kernel_release(socket);
    kthread_stop(thread1);
    wait_for_completion(&comp);
    pr_info("Netlink released\n");
}

module_init(test_init);
module_exit(test_exit);
obj-m += netlink_kernel.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
client:
	gcc nl_user.c -o nl_user
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
	rm -f nl_user
$ make all
$ sudo insmod ./netlink_kernel.ko
$ gcc -o nl_user nl_user.c
$ ./nl_user
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
Sending message to kernel
-------------------------
Sent message: Hello
Receving msg from kernel
------------------------
Received message: Hello from kernel unicast
^CCaught sigINT!
$ sudo rmmod netlink_kernel
$ dmesg
[45587.442654] Driver Loaded
[45587.442796] Netlink initialized
[45587.442799] Thread 1 is running
[45595.620384] Thread 1 is running
[45596.644279] Thread 1 is running
[45597.668497] Thread 1 is running
[45598.696484] Thread 1 is running
[45599.369656] Entering: test_nl_receive_message
[45599.369668] kernel Received message: Hello
[45599.369675] Sent message: Hello from kernel unicast
[45599.369741] Entering: test_nl_receive_message
[45599.369744] kernel Received message: Hello
[45599.369748] Sent message: Hello from kernel unicast
[45599.369802] Entering: test_nl_receive_message
[45599.369804] kernel Received message: Hello
[45599.369808] Sent message: Hello from kernel unicast
[45599.369853] Entering: test_nl_receive_message
[45599.369856] kernel Received message: Hello
[45599.369860] Sent message: Hello from kernel unicast
[45599.369892] Entering: test_nl_receive_message
[45599.369895] kernel Received message: Hello
[45599.369899] Sent message: Hello from kernel unicast
[45599.369944] Entering: test_nl_receive_message
[45599.369947] kernel Received message: Hello
[45599.369950] Sent message: Hello from kernel unicast
[45599.369999] Entering: test_nl_receive_message
[45599.370002] kernel Received message: Hello
[45599.370006] Sent message: Hello from kernel unicast
[45599.370048] Entering: test_nl_receive_message
[45599.370051] kernel Received message: Hello
[45599.370054] Sent message: Hello from kernel unicast
[45599.370102] Entering: test_nl_receive_message
[45599.370105] kernel Received message: Hello
[45599.370108] Sent message: Hello from kernel unicast
[45599.370146] Entering: test_nl_receive_message
[45599.370149] kernel Received message: Hello
[45599.370153] Sent message: Hello from kernel unicast
[45599.716442] Thread 1 is running
[45600.740260] Thread 1 is running
[45601.764221] Thread 1 is running
[45602.788468] Thread 1 is running
[45603.812238] Thread 1 is running
[45604.836370] Thread 1 is running
[45622.244072] Thread 1 is running
[45623.268359] Netlink released
Default Domain:
By default, the socket is created in the AF_NETLINK domain, which carries Netlink messages between the kernel and user space.
Additional Domain Support:
The socket can also be created in the PF_NETLINK domain, which on Linux is defined to the same value as AF_NETLINK.
Socket Creation:
We set up the communication endpoint, the socket, using socket(PF_NETLINK, SOCK_RAW, NETLINK_TESTFAMILY).
Working Scenario:
Despite the change of domain constant to PF_NETLINK, the socket continues to operate the same way, exchanging Netlink messages with the kernel module.
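The equivalence of the two constants can be checked directly. This small sketch uses a hypothetical function name, netlink_domains_match, and relies only on the definitions from sys/socket.h:

```c
#include <sys/socket.h>

/* Returns 1 when PF_NETLINK and AF_NETLINK name the same domain. */
int netlink_domains_match(void)
{
    return PF_NETLINK == AF_NETLINK;
}
```

On Linux the function returns 1, which is why swapping AF_NETLINK for PF_NETLINK in the socket() call changes nothing at run time.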
User Space API | Learning |
---|---|
socket | To create a socket |
epoll | Handles a set of file descriptors with different states, such as reading, writing, and exceptions, by using the struct epoll_event structure and the associated event flags |
sendmsg | To send netlink message |
recvmsg | To receive netlink message |
Kernel Space API | Learning |
---|---|
nlmsg_new | To create a netlink message |
nlmsg_put | To populate the message |
nlmsg_unicast | To send the message to the user space application |
netlink_kernel_create | To create a Netlink socket in the kernel |
netlink_kernel_release | To release the netlink socket |
wake_up_interruptible | To wake up any threads that are waiting on the specified wait queue |
kthread_run | Create and wake a thread |
kthread_should_stop | To determine when thread should exit |
kthread_stop | Stop a thread created by kthread_create |
init_completion | Initializes the given dynamically created completion variable |
complete | Signals any waiting tasks to wake up |
wait_for_completion | Waits for the given completion variable to be signaled |